Spike-EVPR: Deep Spiking Residual Network with Cross-Representation Aggregation for Event-Based Visual Place Recognition (2402.10476v1)

Published 16 Feb 2024 in cs.CV

Abstract: Event cameras have been successfully applied to visual place recognition (VPR) tasks using deep artificial neural networks (ANNs) in recent years. However, previously proposed deep ANN architectures are often unable to harness the abundant temporal information present in event streams. In contrast, deep spiking networks exhibit more intricate spatiotemporal dynamics and are inherently well suited to processing sparse, asynchronous event streams. Unfortunately, directly feeding temporally dense event volumes into a spiking network introduces excessive time steps, resulting in prohibitively high training costs for large-scale VPR tasks. To address these issues, we propose a novel deep spiking network architecture called Spike-EVPR for event-based VPR tasks. First, we introduce two novel event representations tailored for SNNs that fully exploit the spatio-temporal information in event streams while keeping GPU memory consumption during training as low as possible. Then, to exploit the full potential of these two representations, we construct a Bifurcated Spike Residual Encoder (BSR-Encoder) with powerful representational capabilities to extract high-level features from the two event representations. Next, we introduce a Shared & Specific Descriptor Extractor (SSD-Extractor), which extracts features shared between the two representations as well as features specific to each. Finally, we propose a Cross-Descriptor Aggregation Module (CDA-Module) that fuses these three feature sets into a refined, robust global descriptor of the scene. Our experimental results show that Spike-EVPR outperforms several existing EVPR pipelines on the Brisbane-Event-VPR and DDD20 datasets, improving average Recall@1 by 7.61% on Brisbane and 13.20% on DDD20.
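The pipeline described in the abstract has a clear structural flow: two event representations are encoded by separate spiking residual branches (BSR-Encoder), a shared head and two representation-specific heads yield three descriptors (SSD-Extractor), and a final aggregation step fuses them into one global descriptor (CDA-Module). The PyTorch snippet below is a minimal, hypothetical sketch of that flow only; the simple LIF neuron, layer sizes, module names, and fusion scheme are illustrative assumptions, not the authors' implementation.

```python
# Minimal structural sketch of the Spike-EVPR pipeline (assumptions throughout:
# channel counts, descriptor sizes, and the simple LIF neuron are illustrative).
import torch
import torch.nn as nn


class LIFNeuron(nn.Module):
    """Leaky integrate-and-fire neuron with hard threshold and reset
    (forward-pass sketch only; no surrogate gradient is modeled here)."""
    def __init__(self, tau=2.0, v_th=1.0):
        super().__init__()
        self.tau, self.v_th = tau, v_th

    def forward(self, x):                      # x: (T, B, C, H, W) input currents
        v = torch.zeros_like(x[0])
        spikes = []
        for t in range(x.shape[0]):
            v = v + (x[t] - v) / self.tau      # leaky integration
            s = (v >= self.v_th).float()       # emit spike where threshold reached
            v = v * (1.0 - s)                  # hard reset of fired neurons
            spikes.append(s)
        return torch.stack(spikes)             # (T, B, C, H, W) binary spikes


class SpikingBranch(nn.Module):
    """One branch of a bifurcated spiking encoder (stand-in for the BSR-Encoder)."""
    def __init__(self, in_ch, feat_dim=128):
        super().__init__()
        self.conv = nn.Sequential(nn.Conv2d(in_ch, feat_dim, 3, 2, 1),
                                  nn.BatchNorm2d(feat_dim))
        self.lif = LIFNeuron()

    def forward(self, x):                      # x: (T, B, C, H, W)
        T, B = x.shape[:2]
        y = self.conv(x.flatten(0, 1)).unflatten(0, (T, B))
        return self.lif(y)


class SpikeEVPRSketch(nn.Module):
    def __init__(self, ch_a=2, ch_b=2, feat_dim=128, out_dim=256):
        super().__init__()
        self.branch_a = SpikingBranch(ch_a, feat_dim)   # event representation 1
        self.branch_b = SpikingBranch(ch_b, feat_dim)   # event representation 2
        # SSD-Extractor stand-in: one shared head plus one specific head per branch
        self.shared = nn.Linear(feat_dim, out_dim)
        self.spec_a = nn.Linear(feat_dim, out_dim)
        self.spec_b = nn.Linear(feat_dim, out_dim)
        # CDA-Module stand-in: fuse the three descriptors into one global descriptor
        self.fuse = nn.Linear(3 * out_dim, out_dim)

    def forward(self, rep_a, rep_b):           # both: (T, B, C, H, W)
        fa = self.branch_a(rep_a).mean(dim=(0, 3, 4))   # rate-code over time + global pooling
        fb = self.branch_b(rep_b).mean(dim=(0, 3, 4))
        shared = self.shared(fa + fb)                   # features common to both representations
        d = torch.cat([shared, self.spec_a(fa), self.spec_b(fb)], dim=-1)
        return nn.functional.normalize(self.fuse(d), dim=-1)  # L2-normalized global descriptor


if __name__ == "__main__":
    T, B = 4, 2                                 # time steps, batch size (assumed)
    rep_a = torch.rand(T, B, 2, 64, 64)         # hypothetical event representation 1
    rep_b = torch.rand(T, B, 2, 64, 64)         # hypothetical event representation 2
    desc = SpikeEVPRSketch()(rep_a, rep_b)
    print(desc.shape)                           # torch.Size([2, 256])
```

In a VPR setting, such global descriptors would typically be compared by nearest-neighbor search between query and reference traverses to decide whether two observations come from the same place.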
