
MaskFi: Unsupervised Learning of WiFi and Vision Representations for Multimodal Human Activity Recognition (2402.19258v1)

Published 29 Feb 2024 in cs.CV

Abstract: Human activity recognition (HAR) plays an increasingly important role in domains such as healthcare, security monitoring, and metaverse gaming. Although numerous HAR methods based on computer vision perform well, they still suffer from poor robustness in adverse visual conditions, in particular low illumination, which motivates WiFi-based HAR as a complementary modality. Existing solutions that combine WiFi and vision rely on massive amounts of labeled data, which are very cumbersome to collect. In this paper, we propose a novel unsupervised multimodal HAR solution, MaskFi, that uses only unlabeled video and WiFi activity data for model training. We propose a new algorithm, masked WiFi-vision modeling (MI2M), that enables the model to learn cross-modal and single-modal features by predicting the masked sections during representation learning. Thanks to this unsupervised learning procedure, the network requires only a small amount of annotated data for finetuning and can adapt to a new environment with better performance. We conduct extensive experiments on two WiFi-vision datasets collected in-house, and our method achieves both robust and accurate human activity recognition and human identification.
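
To make the idea of "predicting the masked sections" concrete, the following is a minimal sketch of masked modeling over a joint WiFi and video token sequence. It is not the authors' implementation: the token shapes, the 0.5 mask ratio, the zero-fill masking, the helper names `mask_tokens` and `masked_mse`, and the mean-squared-error objective are all assumptions chosen for illustration, and a trivial stand-in replaces the actual network.

```python
import numpy as np

def mask_tokens(tokens, mask_ratio, rng):
    """Zero out a random subset of token rows.

    Returns the masked copy and a boolean vector marking masked rows.
    (Hypothetical helper; zero-fill is an assumed masking scheme.)
    """
    n = tokens.shape[0]
    n_mask = int(round(mask_ratio * n))
    idx = rng.choice(n, size=n_mask, replace=False)
    mask = np.zeros(n, dtype=bool)
    mask[idx] = True
    masked = tokens.copy()
    masked[mask] = 0.0
    return masked, mask

def masked_mse(pred, target, mask):
    """Mean squared error computed over masked positions only."""
    diff = pred[mask] - target[mask]
    return float(np.mean(diff ** 2))

rng = np.random.default_rng(0)
wifi_tokens = rng.normal(size=(8, 4))    # assumed: 8 WiFi tokens of dimension 4
video_tokens = rng.normal(size=(12, 4))  # assumed: 12 video tokens of dimension 4

# Concatenate both modalities so masked positions in one modality can be
# inferred from visible positions in either modality.
joint = np.concatenate([wifi_tokens, video_tokens], axis=0)
masked_joint, mask = mask_tokens(joint, mask_ratio=0.5, rng=rng)

# Stand-in "model": predicts the mean of the visible tokens everywhere.
visible_mean = masked_joint[~mask].mean(axis=0)
pred = np.tile(visible_mean, (joint.shape[0], 1))

loss = masked_mse(pred, joint, mask)
print(f"masked positions: {int(mask.sum())}, loss: {loss:.4f}")
```

In a real system the stand-in predictor would be a trainable encoder, and the loss would be minimized over unlabeled WiFi-video pairs before finetuning on a small labeled set.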
