Enhancing Brazilian Sign Language Recognition through Skeleton Image Representation (2404.19148v1)

Published 29 Apr 2024 in cs.CV

Abstract: Effective communication is paramount for the inclusion of deaf individuals in society. However, persistent communication barriers due to limited Sign Language (SL) knowledge hinder their full participation. In this context, Sign Language Recognition (SLR) systems have been developed to improve communication between signing and non-signing individuals. In particular, the problem of recognizing isolated signs (Isolated Sign Language Recognition, ISLR) is of great relevance to the development of vision-based SL search engines, learning tools, and translation systems. This work proposes an ISLR approach in which body, hand, and facial landmarks are extracted over time and encoded as 2-D images. These images are processed by a convolutional neural network, which maps the visual-temporal information to a sign label. Experimental results demonstrate that our method surpassed the state of the art in terms of performance metrics on two widely recognized datasets in Brazilian Sign Language (LIBRAS), the primary focus of this study. In addition to being more accurate, our method is more time-efficient and easier to train, owing to its reliance on a simpler network architecture and solely RGB data as input.
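To make the encoding concrete, below is a minimal sketch of the general skeleton-image idea the abstract describes, not the authors' exact method: landmark coordinates from each frame are arranged into a 2-D image (rows as keypoints, columns as frames, channels as coordinates) and classified with a standard CNN. The fixed clip length, the 33-keypoint layout, and the ResNet-18 backbone are illustrative assumptions.

```python
# Minimal sketch of a skeleton-image ISLR pipeline (illustrative, not the
# authors' implementation). Assumes landmarks were already extracted by a
# pose estimator as an array of shape (num_frames, num_keypoints, 2),
# with coordinates normalized to [0, 1].
import numpy as np
import torch
import torch.nn as nn
from torchvision.models import resnet18


def landmarks_to_image(landmarks: np.ndarray, target_frames: int = 64) -> torch.Tensor:
    """Encode a (T, K, 2) landmark sequence as a 3-channel image:
    rows = keypoints, columns = time, channels = (x, y, unused)."""
    t, k, _ = landmarks.shape
    # Resample the time axis to a fixed width so clips of any duration
    # produce images of identical size.
    idx = np.linspace(0, t - 1, target_frames).round().astype(int)
    seq = landmarks[idx]                       # (target_frames, K, 2)
    img = np.zeros((3, k, target_frames), dtype=np.float32)
    img[0] = seq[..., 0].T                     # x coordinates over time
    img[1] = seq[..., 1].T                     # y coordinates over time
    return torch.from_numpy(img)


class SkeletonImageClassifier(nn.Module):
    """CNN mapping a skeleton image to sign logits. ResNet-18 is an
    assumed stand-in for the paper's 'simpler network architecture'."""

    def __init__(self, num_classes: int):
        super().__init__()
        self.backbone = resnet18(weights=None)
        self.backbone.fc = nn.Linear(self.backbone.fc.in_features, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.backbone(x)


# Usage with a fake 90-frame clip of 33 keypoints and 50 sign classes.
clip = np.random.rand(90, 33, 2).astype(np.float32)
model = SkeletonImageClassifier(num_classes=50)
logits = model(landmarks_to_image(clip).unsqueeze(0))  # shape: (1, 50)
```

In practice, an off-the-shelf pose estimator such as OpenPose or MediaPipe could supply the landmark sequences; the encoding above is one common way to turn them into images a 2-D CNN can consume.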
