Motion Robust High-Speed Light-Weighted Object Detection With Event Camera (2208.11602v2)

Published 24 Aug 2022 in cs.CV

Abstract: In this work, we propose a motion-robust, high-speed detection pipeline that better leverages event data. First, we design an event stream representation called temporal active focus (TAF), which efficiently utilizes the spatio-temporal asynchronous event stream to construct event tensors that are robust to object motion. Then, we propose the bifurcated folding module (BFM), which encodes the rich temporal information of the TAF tensor at the input layer of the detector. Following this, we design a high-speed, lightweight detector called the agile event detector (AED), together with a simple but effective data augmentation method, to improve detection accuracy and reduce the number of model parameters. Experiments on two typical real-scene event camera object detection datasets show that our method is competitive in terms of accuracy, efficiency, and parameter count. By classifying objects into multiple motion levels based on an optical flow density metric, we further demonstrate the robustness of our method for objects with different velocities relative to the camera. The code and trained models are available at https://github.com/HarmoniaLeo/FRLW-EvD.
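The abstract describes TAF only at a high level. As a rough illustration of the general idea of converting an asynchronous event stream of (timestamp, x, y, polarity) tuples into a dense tensor that a conventional detector can consume, the sketch below builds a simple voxel-grid representation. It is not the paper's TAF construction; all function names, parameters, and the binning scheme are illustrative assumptions.

```python
import numpy as np

def events_to_voxel_grid(events, height, width, num_bins, window_us):
    """Accumulate an asynchronous event stream into a dense (num_bins, H, W)
    tensor by binning event polarities over time. This is a generic voxel-grid
    sketch for feeding events to a frame-based detector, NOT the TAF
    representation proposed in the paper.

    events: sequence of (t, x, y, p) tuples, t in microseconds, p in {-1, +1},
            assumed sorted by timestamp.
    """
    grid = np.zeros((num_bins, height, width), dtype=np.float32)
    if len(events) == 0:
        return grid
    t_end = events[-1][0]
    t_start = t_end - window_us
    for t, x, y, p in events:
        if t < t_start:
            continue
        # Map the timestamp to a temporal bin; newer events land in later bins.
        b = min(int((t - t_start) * num_bins / window_us), num_bins - 1)
        grid[b, int(y), int(x)] += float(p)
    return grid

# Hypothetical usage: build a 5-bin tensor from the last 50 ms of events
# for a 304x240 event sensor.
# tensor = events_to_voxel_grid(event_list, height=240, width=304,
#                               num_bins=5, window_us=50_000)
```

A fixed-window accumulation like this loses the asynchronous timing structure that TAF is designed to preserve; the paper's contribution is precisely a representation that remains robust across different object motions, which this generic sketch does not attempt to reproduce.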

