Uncertainty-aware Bridge based Mobile-Former Network for Event-based Pattern Recognition (2401.11123v2)

Published 20 Jan 2024 in cs.CV

Abstract: Mainstream human activity recognition (HAR) algorithms are developed for RGB cameras and are therefore easily degraded by low-quality images (e.g., low illumination, motion blur). Meanwhile, the privacy concerns raised by ultra-high-definition (HD) RGB cameras have attracted growing attention. Inspired by the success of event cameras, which offer high dynamic range, freedom from motion blur, and low energy consumption, we propose to recognize human actions from the event stream. We propose a lightweight Mobile-Former network based on uncertainty-aware information propagation for efficient pattern recognition, which effectively combines the MobileNet and Transformer networks. Specifically, we first embed the event images into feature representations using a stem network, then feed them into uncertainty-aware Mobile-Former blocks for local and global feature learning and fusion. Finally, the features from the MobileNet and Transformer branches are concatenated for pattern recognition. Extensive experiments on multiple event-based recognition datasets validate the effectiveness of our model. The source code of this work will be released at https://github.com/Event-AHU/Uncertainty_aware_MobileFormer.
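
The data flow described in the abstract (stem embedding, stacked two-branch blocks, concatenation, classification) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. Everything below is an assumption made for illustration: the patch-based stem, the per-position linear layer standing in for the MobileNet branch, the attention-style bridges, and in particular the reading of "uncertainty-aware information propagation" as sampling each bridge message from a Gaussian with a predicted mean and log-variance. All function names, parameter names, and sizes are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def stem(img, w, patch=4):
    # Assumed stem: cut the event image (H, W, C) into non-overlapping
    # patches and project each patch linearly to a D-dim local feature.
    H, W, C = img.shape
    p = img.reshape(H // patch, patch, W // patch, patch, C)
    p = p.transpose(0, 2, 1, 3, 4).reshape(-1, patch * patch * C)
    return p @ w  # (N, D)

def noisy_bridge(query, context, w_mu, w_logvar, rng):
    # Assumed "uncertainty-aware" bridge: attend from query to context,
    # predict a mean and log-variance for the message, then sample it
    # with the reparameterization trick (mu + sigma * eps).
    attn = softmax(query @ context.T / np.sqrt(query.shape[-1]))
    msg = attn @ context
    mu, logvar = msg @ w_mu, msg @ w_logvar
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * logvar) * eps

def block(local, tokens, p, rng):
    # Local branch -> global tokens (Mobile -> Former direction).
    tokens = tokens + noisy_bridge(tokens, local, p["mu_a"], p["lv_a"], rng)
    # Stand-in for the MobileNet branch: per-position linear layer + ReLU.
    local = local + np.maximum(local @ p["w_local"], 0.0)
    # Global tokens -> local branch (Former -> Mobile direction).
    local = local + noisy_bridge(local, tokens, p["mu_b"], p["lv_b"], rng)
    return local, tokens

def init_params(rng, in_dim=32, d=16, n_tokens=3, n_blocks=2, n_classes=5):
    g = lambda *shape: 0.1 * rng.standard_normal(shape)
    keys = ("mu_a", "lv_a", "mu_b", "lv_b", "w_local")
    blocks = [{k: g(d, d) for k in keys} for _ in range(n_blocks)]
    return {"w_stem": g(in_dim, d), "tokens": g(n_tokens, d),
            "blocks": blocks, "w_cls": g(2 * d, n_classes)}

def classify(img, params, rng):
    local, tokens = stem(img, params["w_stem"]), params["tokens"]
    for p in params["blocks"]:
        local, tokens = block(local, tokens, p, rng)
    # Concatenate pooled local features with the first global token.
    feat = np.concatenate([local.mean(axis=0), tokens[0]])
    return softmax(feat @ params["w_cls"])

rng = np.random.default_rng(0)
params = init_params(rng)
event_image = rng.random((16, 16, 2))  # assumed: two polarity channels
probs = classify(event_image, params, rng)
print(probs.shape)
```

The sketch keeps the two branches separate until the final concatenation, mirroring the last step the abstract describes. A real model would use convolutions, learned tokens, multi-head attention, and a training loss, none of which are shown here.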
