
G3R: Generating Rich and Fine-grained mmWave Radar Data from 2D Videos for Generalized Gesture Recognition (2404.14934v1)

Published 23 Apr 2024 in cs.MM, cs.CV, and cs.HC

Abstract: Millimeter-wave radar has recently gained traction as a promising modality for pervasive and privacy-preserving gesture recognition. However, the lack of rich and fine-grained radar datasets hinders progress in developing generalized deep learning models for gesture recognition across various user postures (e.g., standing, sitting), positions, and scenes. To remedy this, we design a software pipeline that exploits abundant 2D videos to generate realistic radar data; the key challenge is simulating the diverse and fine-grained reflection properties of user gestures. To this end, we design G3R with three key components: (i) a gesture reflection point generator that expands the arm's skeleton points to form human reflection points; (ii) a signal simulation model that simulates the multipath reflection and attenuation of radar signals to output a human intensity map; (iii) an encoder-decoder model that combines a sampling module and a fitting module to reconcile the differences in the number and distribution of points between generated and real-world radar data. We implement and evaluate G3R using 2D videos from public data sources and self-collected real-world radar data, demonstrating its superiority over state-of-the-art approaches for gesture recognition.
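To make the pipeline structure concrete, the sketch below illustrates stages (i) and (ii) in simplified form: sparse skeleton joints are densified into a cloud of reflection points, and their attenuated returns are binned into a coarse range-azimuth intensity map. This is a minimal sketch under assumed simplifications, not the authors' implementation; every function name, constant, and grid size here is hypothetical, multipath is ignored, and stage (iii)'s learned encoder-decoder is omitted.

```python
# Hypothetical sketch of stages (i) and (ii) of a G3R-like pipeline.
# Names, constants, and simplifications are illustrative only.
import numpy as np

def expand_reflection_points(skeleton: np.ndarray, n_per_joint: int = 8) -> np.ndarray:
    """Stage (i) stand-in: densify K x 3 arm skeleton joints into a cloud
    of candidate human reflection points by jittering around each joint."""
    jitter = np.random.normal(scale=0.02, size=(len(skeleton), n_per_joint, 3))
    return (skeleton[:, None, :] + jitter).reshape(-1, 3)

def simulate_intensity_map(points: np.ndarray, radar_pos: np.ndarray,
                           bins: int = 64, max_range: float = 5.0) -> np.ndarray:
    """Stage (ii) stand-in: attenuate each reflection with two-way
    free-space path loss (~1/r^4) and accumulate the returns into a
    coarse range-azimuth intensity map (multipath omitted for brevity)."""
    rel = points - radar_pos
    r = np.linalg.norm(rel, axis=1)                      # range of each point
    az = np.arctan2(rel[:, 1], rel[:, 0])                # azimuth of each point
    power = 1.0 / np.maximum(r, 1e-3) ** 4               # radar-equation falloff
    ri = np.clip((r / max_range * bins).astype(int), 0, bins - 1)
    ai = np.clip(((az + np.pi) / (2 * np.pi) * bins).astype(int), 0, bins - 1)
    grid = np.zeros((bins, bins))
    np.add.at(grid, (ri, ai), power)                     # bin reflected power
    return grid

# Example: five arm joints roughly 1.5 m in front of the radar.
skeleton = np.array([[1.5, 0.0, 1.0 + 0.1 * i] for i in range(5)])
intensity = simulate_intensity_map(expand_reflection_points(skeleton),
                                   radar_pos=np.zeros(3))
```

In the paper's pipeline, an intensity map like this would then pass to the stage (iii) encoder-decoder, whose sampling and fitting modules reconcile the number and distribution of points between generated and real-world radar data.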

