Accelerated Event-Based Feature Detection and Compression for Surveillance Video Systems (2312.08213v2)

Published 13 Dec 2023 in cs.MM and cs.CV

Abstract: The strong temporal consistency of surveillance video enables compelling compression performance with traditional methods, but downstream vision applications operate on decoded image frames with a high data rate. Since it is not straightforward for applications to extract information on temporal redundancy from the compressed video representations, we propose a novel system that conveys temporal redundancy within a sparse decompressed representation. We leverage a video representation framework called ADDER to transcode framed videos to sparse, asynchronous intensity samples. We introduce mechanisms for content adaptation, lossy compression, and asynchronous forms of classical vision algorithms. We evaluate our system on the VIRAT surveillance video dataset, and we show a median 43.7% speed improvement in FAST feature detection compared to OpenCV. We run the same algorithm as OpenCV, but process only the pixels that receive new asynchronous events, rather than processing every pixel in an image frame. Our work paves the way for upcoming neuromorphic sensors and is amenable to future applications with spiking neural networks.
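To illustrate the idea of testing only pixels that receive new events, here is a minimal Python sketch. It is not the authors' implementation, nor the ADDER or OpenCV API: the function names (`is_fast_corner`, `detect_full_frame`, `detect_event_driven`), the `(x, y, intensity)` event tuple, and the threshold value are all assumptions made for illustration. The corner test is the plain FAST-9 segment test on a radius-3 circle, without the usual high-speed pre-check or non-maximum suppression.

```python
# Hypothetical sketch: names and the event format are illustrative assumptions.

# The 16 (dx, dy) offsets of the radius-3 circle used by the FAST segment test.
CIRCLE = [(0, -3), (1, -3), (2, -2), (3, -1), (3, 0), (3, 1), (2, 2), (1, 3),
          (0, 3), (-1, 3), (-2, 2), (-3, 1), (-3, 0), (-3, -1), (-2, -2), (-1, -3)]


def is_fast_corner(img, x, y, threshold=20, arc=9):
    """Return True if `arc` contiguous circle pixels are all brighter or
    all darker than the center pixel by more than `threshold`."""
    h, w = len(img), len(img[0])
    if x < 3 or y < 3 or x >= w - 3 or y >= h - 3:
        return False  # the circle would leave the image
    center = img[y][x]
    ring = [img[y + dy][x + dx] for dx, dy in CIRCLE]
    for sign in (1, -1):  # +1: brighter arc, -1: darker arc
        flags = [sign * (p - center) > threshold for p in ring]
        run = 0
        for flag in flags + flags[:arc - 1]:  # repeat the start to handle wrap-around
            run = run + 1 if flag else 0
            if run >= arc:
                return True
    return False


def detect_full_frame(img, threshold=20):
    """Frame-based baseline: test every pixel of the image."""
    h, w = len(img), len(img[0])
    return {(x, y) for y in range(h) for x in range(w)
            if is_fast_corner(img, x, y, threshold)}


def detect_event_driven(img, events, corners, threshold=20):
    """Apply each (x, y, intensity) event to `img` and re-test only that pixel.

    `corners` is updated in place; the return value is the number of
    segment tests performed, which equals the number of events.
    """
    tested = 0
    for x, y, intensity in events:
        img[y][x] = intensity
        tested += 1
        if is_fast_corner(img, x, y, threshold):
            corners.add((x, y))
        else:
            corners.discard((x, y))
    return tested


if __name__ == "__main__":
    image = [[100] * 16 for _ in range(16)]
    found = detect_full_frame(image)           # 256 segment tests
    n = detect_event_driven(image, [(8, 8, 10)], found)
    print(n, sorted(found))
```

In this sketch the full-frame pass visits every pixel, while the event-driven pass performs one segment test per event, so the work scales with the number of changed pixels rather than the frame size. A changed pixel can also alter the test outcome for neighbors whose circle contains it; this sketch ignores that case and re-tests only the pixel that received the event.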
