Advancing Spiking Neural Networks towards Multiscale Spatiotemporal Interaction Learning (2405.13672v2)

Published 22 May 2024 in cs.CV

Abstract: Recent advancements in neuroscience research have propelled the development of Spiking Neural Networks (SNNs), which not only have the potential to further advance neuroscience but also serve as an energy-efficient alternative to Artificial Neural Networks (ANNs) thanks to their spike-driven characteristics. However, previous studies have often neglected multiscale information and its spatiotemporal correlations in event data, leading SNN models to treat each frame of input events as a static image. We hypothesize that this oversimplification contributes significantly to the performance gap between SNNs and traditional ANNs. To address this issue, we design a Spiking Multiscale Attention (SMA) module that captures multiscale spatiotemporal interaction information. Furthermore, we develop a regularization method named Attention ZoneOut (AZO), which uses spatiotemporal attention weights to reduce the model's generalization error through pseudo-ensemble training. Our approach achieves state-of-the-art results on mainstream neuromorphic datasets. Additionally, we reach 77.1% accuracy on the ImageNet-1K dataset using a 104-layer ResNet architecture enhanced with SMA and AZO. This confirms the state-of-the-art performance of SNNs with non-transformer architectures and underscores the effectiveness of our method in narrowing the performance gap between SNN models and traditional ANN models.
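The abstract describes the two components only at a high level, so the sketches below are hedged illustrations of the underlying mechanics, not the authors' implementation. The first shows one plausible shape for a multiscale attention block over an event-frame sequence of shape [T, B, C, H, W]: squeeze-and-excitation-style branches pooled at several grid sizes, fused into a single sigmoid attention map applied at every time step. The class name, the `scales=(1, 2, 4)` pyramid, and the `reduction=4` bottleneck are all assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class MultiscaleSpatioTemporalAttention(nn.Module):
    """Hypothetical sketch of a multiscale attention block for event-frame
    sequences shaped [T, B, C, H, W]; not the paper's SMA implementation."""

    def __init__(self, channels: int, scales=(1, 2, 4), reduction: int = 4):
        super().__init__()
        # One squeeze-and-excitation-style branch per pooling scale.
        self.branches = nn.ModuleList(
            nn.Sequential(
                nn.AdaptiveAvgPool2d(s),                       # pool to s x s
                nn.Conv2d(channels, channels // reduction, 1),
                nn.ReLU(inplace=True),
                nn.Conv2d(channels // reduction, channels, 1),
            )
            for s in scales
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        t, b, c, h, w = x.shape
        flat = x.reshape(t * b, c, h, w)
        # Sum the excitations from every spatial scale, then squash to (0, 1).
        att = sum(F.interpolate(branch(flat), size=(h, w), mode="nearest")
                  for branch in self.branches)
        att = torch.sigmoid(att)
        return (flat * att).reshape(t, b, c, h, w)
```

AZO builds on zoneout (Krueger et al., 2016), which regularizes recurrent states by randomly preserving the previous time step's activations. A hedged reading of the abstract is that the preservation probability is modulated by the attention weights; the sketch below encodes that reading, with `p_max` and the linear scaling invented for illustration.

```python
import torch


def attention_zoneout(h_prev: torch.Tensor,
                      h_new: torch.Tensor,
                      att: torch.Tensor,
                      p_max: float = 0.2,
                      training: bool = True) -> torch.Tensor:
    """Hypothetical attention-guided zoneout: keep the previous step's
    hidden/membrane state with a probability scaled by the attention weight."""
    p = p_max * att.clamp(0.0, 1.0)
    if not training:
        # Inference uses the expected mixture, as in standard zoneout.
        return p * h_prev + (1.0 - p) * h_new
    keep = torch.bernoulli(p)  # 1 -> preserve previous state, 0 -> update
    return keep * h_prev + (1.0 - keep) * h_new
```

Because each training step samples a different preservation mask, the network is effectively trained as an ensemble of weight-sharing subnetworks, which is one way to read the "pseudo-ensemble training" the abstract mentions.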

Authors (6)
  1. Yimeng Shan
  2. Malu Zhang
  3. Xuerui Qiu
  4. Jason K. Eshraghian
  5. Haicheng Qu
  6. Rui-Jie Zhu