
SFOD: Spiking Fusion Object Detector (2403.15192v1)

Published 22 Mar 2024 in cs.CV and cs.AI

Abstract: Event cameras, characterized by high temporal resolution, high dynamic range, low power consumption, and high pixel bandwidth, offer unique capabilities for object detection in specialized contexts. Despite these advantages, the inherent sparsity and asynchrony of event data pose challenges to existing object detection algorithms. Spiking Neural Networks (SNNs), inspired by the way the human brain codes and processes information, offer a potential solution to these difficulties. However, their performance in object detection using event cameras is limited in current implementations. In this paper, we propose the Spiking Fusion Object Detector (SFOD), a simple and efficient approach to SNN-based object detection. Specifically, we design a Spiking Fusion Module, achieving the first-time fusion of feature maps from different scales in SNNs applied to event cameras. Additionally, through integrating our analysis and experiments conducted during the pretraining of the backbone network on the NCAR dataset, we delve deeply into the impact of spiking decoding strategies and loss functions on model performance. Thereby, we establish state-of-the-art classification results based on SNNs, achieving 93.7% accuracy on the NCAR dataset. Experimental results on the GEN1 detection dataset demonstrate that the SFOD achieves a state-of-the-art mAP of 32.1%, outperforming existing SNN-based approaches. Our research not only underscores the potential of SNNs in object detection with event cameras but also propels the advancement of SNNs. Code is available at https://github.com/yimeng-fan/SFOD.
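
For readers unfamiliar with the terms, the following is a minimal sketch of what "spiking" and "spike decoding" can mean in general. It is not taken from the SFOD code: the function names `lif_spikes` and `rate_decode`, the parameter values, and the choice of a hard-reset leaky integrate-and-fire neuron with firing-rate decoding are all assumptions made for illustration. The abstract does not say which neuron model or decoding strategy the paper settles on.

```python
def lif_spikes(inputs, tau=2.0, v_threshold=1.0, v_reset=0.0):
    """Simulate one leaky integrate-and-fire neuron (illustrative, hypothetical helper).

    inputs: list of input currents, one per time step.
    Returns a list of 0/1 spikes of the same length.
    """
    v = v_reset
    spikes = []
    for x in inputs:
        # Leaky integration: the membrane potential moves toward the input.
        v = v + (x - (v - v_reset)) / tau
        if v >= v_threshold:
            spikes.append(1)
            v = v_reset  # hard reset after firing (an assumed reset rule)
        else:
            spikes.append(0)
    return spikes


def rate_decode(spike_trains):
    """Firing-rate decoding: mean number of spikes per time step for each output neuron."""
    return [sum(train) / len(train) for train in spike_trains]


# Two hypothetical output neurons driven by constant currents over 4 time steps.
trains = [lif_spikes([3.0] * 4), lif_spikes([1.2] * 4)]
rates = rate_decode(trains)
predicted_class = rates.index(max(rates))
```

In this toy setup the neuron with the stronger input fires at every step, so rate decoding selects it as the predicted class. Other decoding strategies, such as reading out the final membrane potential instead of counting spikes, would replace `rate_decode` only.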

Authors (5)
  1. Yimeng Fan (5 papers)
  2. Wei Zhang (1489 papers)
  3. Changsong Liu (10 papers)
  4. Mingyang Li (86 papers)
  5. Wenrui Lu (2 papers)
Citations (1)
