Sign Language Recognition Based On Facial Expression and Hand Skeleton (2407.02241v1)

Published 2 Jul 2024 in cs.CV

Abstract: Sign language is a visual language used by the deaf and hard-of-hearing community to communicate. However, most recognition methods based on monocular cameras suffer from low accuracy and poor robustness: even a model that works well on one dataset may perform poorly on data with different interference, because it cannot extract effective features. To address these problems, we propose a sign language recognition network that integrates hand skeleton features with facial expression. In particular, we propose a hand skeleton feature extraction method based on coordinate transformation to describe the shape of the hand more accurately. Moreover, by incorporating facial expression information, the accuracy and robustness of sign language recognition are improved, as verified on LSA64 (A Dataset for Argentinian Sign Language) and SEU's Chinese Sign Language Recognition Database (SEUCSLRD).
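
The abstract does not spell out the coordinate transformation itself. The sketch below shows one common way to normalize hand landmarks for shape description: translate the wrist to the origin, rescale by a reference bone length, and remove in-plane rotation. It assumes 21 (x, y, z) landmarks in the MediaPipe Hands layout (the paper cites MediaPipe); the function name and the choice of the middle-finger MCP joint as the reference are illustrative, not taken from the paper.

```python
import numpy as np

def normalize_hand_skeleton(landmarks: np.ndarray) -> np.ndarray:
    """Map 21 hand landmarks into a hand-centric frame so the
    resulting shape features are invariant to where the hand
    appears in the image, how large it is, and how it is rotated
    in the image plane.

    landmarks: (21, 3) array of (x, y, z) points, e.g. from a
    landmark detector such as MediaPipe Hands. Index 0 is the
    wrist and index 9 the middle-finger MCP joint in that layout.
    """
    pts = landmarks.astype(np.float64)

    # 1. Translation: move the wrist to the origin.
    pts = pts - pts[0]

    # 2. Scale: divide by the wrist-to-middle-MCP distance so hand
    #    size and camera distance no longer matter.
    scale = np.linalg.norm(pts[9])
    if scale > 1e-8:
        pts = pts / scale

    # 3. Rotation: align the wrist->middle-MCP direction with the
    #    +y axis to remove in-plane rotation of the hand.
    angle = np.arctan2(pts[9, 0], pts[9, 1])  # angle from the +y axis
    c, s = np.cos(angle), np.sin(angle)
    rot = np.array([[c, -s, 0.0],
                    [s,  c, 0.0],
                    [0.0, 0.0, 1.0]])
    return pts @ rot.T
```

A per-frame feature vector could then be built by flattening the normalized coordinates and concatenating them with facial expression features before a sequence classifier; the abstract only names that fusion, so the details here are a sketch.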

References (10)
  1. K. Wang, X. Peng, J. Yang, S. Lu, and Y. Qiao, “Suppressing uncertainties for large-scale facial expression recognition,” CoRR, vol. abs/2002.10392, 2020.
  2. M. Madhiarasan and P. P. Roy, “A comprehensive review of sign language recognition: Different types, modalities, and datasets,” 2022.
  3. Q. De Smedt, H. Wannous, and J.-P. Vandeborre, “Skeleton-based dynamic hand gesture recognition,” in 2016 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), 2016, pp. 1206–1214.
  4. R. Cui, A. Zhu, S. Zhang, and G. Hua, “Multi-source learning for skeleton-based action recognition using deep LSTM networks,” in 2018 24th International Conference on Pattern Recognition (ICPR), 2018, pp. 547–552.
  5. K. Cho, B. van Merrienboer, C. Gulcehre, D. Bahdanau, F. Bougares, H. Schwenk, and Y. Bengio, “Learning phrase representations using RNN encoder-decoder for statistical machine translation,” 2014.
  6. S. Masood, A. Srivastava, H. C. Thuwal, and M. Ahmad, “Real-time sign language gesture (word) recognition from video sequences using CNN and RNN,” in Intelligent Engineering Informatics, V. Bhateja, C. A. Coello Coello, S. C. Satapathy, and P. K. Pattnaik, Eds. Singapore: Springer Singapore, 2018, pp. 623–632.
  7. J. Liu, A. Shahroudy, D. Xu, and G. Wang, “Spatio-temporal LSTM with trust gates for 3D human action recognition,” CoRR, vol. abs/1607.07043, 2016.
  8. Q. Xiao, M. Qin, and Y. Yin, “Skeleton-based Chinese sign language recognition and generation for bidirectional communication between deaf and hearing people,” Neural Networks, vol. 125, pp. 41–55, 2020.
  9. C. Lugaresi, J. Tang, H. Nash, C. McClanahan, E. Uboweja, M. Hays, F. Zhang, C.-L. Chang, M. G. Yong, J. Lee, W.-T. Chang, W. Hua, M. Georg, and M. Grundmann, “MediaPipe: A framework for building perception pipelines,” 2019.
  10. F. Ronchetti, F. Quiroga, C. Estrebou, L. Lanzarini, and A. Rosete, “LSA64: A dataset of Argentinian sign language,” in XXII Congreso Argentino de Ciencias de la Computación (CACIC), 2016.
