Word-level Deep Sign Language Recognition from Video: A New Large-scale Dataset and Methods Comparison (1910.11006v2)

Published 24 Oct 2019 in cs.CV, cs.HC, cs.MM, and cs.NE

Abstract: Vision-based sign language recognition aims at helping deaf people to communicate with others. However, most existing sign language datasets are limited to a small number of words. Due to the limited vocabulary size, models learned from those datasets cannot be applied in practice. In this paper, we introduce a new large-scale Word-Level American Sign Language (WLASL) video dataset, containing more than 2000 words performed by over 100 signers. This dataset will be made publicly available to the research community. To our knowledge, it is by far the largest public ASL dataset to facilitate word-level sign recognition research. Based on this new large-scale dataset, we are able to experiment with several deep learning methods for word-level sign recognition and evaluate their performance in large-scale scenarios. Specifically, we implement and compare two different models, i.e., (i) a holistic visual appearance-based approach and (ii) a 2D human pose-based approach. Both models are valuable baselines that will benefit the community for method benchmarking. Moreover, we also propose a novel pose-based temporal graph convolution network (Pose-TGCN) that models spatial and temporal dependencies in human pose trajectories simultaneously, which further boosts the performance of the pose-based method. Our results show that pose-based and appearance-based models achieve comparable performance, up to 66% top-10 accuracy on 2,000 words/glosses, demonstrating the validity and challenges of our dataset. Our dataset and baseline deep models are available at https://dxli94.github.io/WLASL/.

Authors (4)

Dongxu Li (40 papers)
Cristian Rodriguez Opazo (1 paper)
Xin Yu (192 papers)
Hongdong Li (172 papers)

Citations (380)


Summary

Summary of "LaTeX Author Guidelines for WACV Proceedings"

The document titled "LaTeX Author Guidelines for WACV Proceedings" serves as a comprehensive style guide for authors preparing submissions for the Winter Conference on Applications of Computer Vision (WACV). This guide is critical for ensuring consistency and clarity across submissions, which facilitates a smooth review process and maintains the quality of the conference proceedings.

Key Elements of the Document

Manuscript Language and Format: All submissions must be composed in English and properly formatted according to the specified guidelines. The use of Times Roman or a similar font is required to maintain a professional appearance throughout the document.
Submission Policies: The document outlines strict policies regarding dual submissions, emphasizing that manuscripts must not be concurrently under review for other conferences or journals. This is crucial to preserve the integrity of the review process and to encourage original contributions.
Paper Length and Layout: There are explicit instructions regarding the permitted length of manuscripts, the use of columns, and the appropriate margins. The text format specifies paragraph indentation, spacing, and justification, which authors must follow so that submissions stay within the allowed limits.
Review Process Norms:
- Blind Review: The guidelines further explain the process of anonymizing submissions to enable blind reviews. Proper citation without the use of personal pronouns linked to the authors' own previous work is emphasized to prevent bias during the evaluation process.
- Use of a Printed Ruler: Authors using LaTeX are required to include a ruler on the draft submitted for review. This tool assists reviewers in providing precise feedback but should be removed from the camera-ready version.
Mathematical Expressions: The document mandates detailed and consistent numbering of equations and sections to facilitate reference in future research, thereby reinforcing academic rigor.

Practical Features

The guide addresses various practicalities such as the use of color in figures, the importance of maintaining readability when printed in grayscale, and the necessity of matching the fonts and image dimensions to the text for uniformity.

The guide also emphasizes the effective use of LaTeX commands, such as \includegraphics, to incorporate figures at high quality.

Implications and Future Considerations

The guidelines set a standard for uniform submission practices, which contributes to a more efficient evaluation process and the production of high-caliber conference materials. By stressing the importance of adherence to academic and formatting standards, the document implicitly encourages authors to focus on the substantive quality of their research findings as well as their presentation.

Looking forward, such comprehensive guidelines will continue to play a pivotal role as AI and computer vision research evolve. Adherence to these established norms will ensure that burgeoning AI methodologies and discoveries are effectively communicated and disseminated within the scientific community. This paper primarily functions as a procedural guide and does not introduce new research findings or experimental results; hence it is unlikely to impact theoretical advancements directly. However, as the field progresses, updates to these guidelines may be necessary to accommodate novel forms of data presentation and new software tools emerging from advances in the domain.

Overall, this document serves as an essential resource for authors wishing to contribute to the WACV proceedings, providing a clear framework for preparing manuscripts that meet the conference's stringent academic and presentation standards.
