A New Dataset for End-to-End Sign Language Translation: The Greek Elementary School Dataset (2310.04753v1)
Abstract: Automatic Sign Language Translation (SLT) is a research avenue of great societal impact. End-to-End SLT facilitates the interaction of Hard-of-Hearing (HoH) people with hearing people, thus improving their social life and their opportunities for participation in society. However, research in this area is still in its infancy, and current resources are particularly limited. Existing SLT methods either have low translation ability or are trained and evaluated on datasets of restricted vocabulary and questionable real-world value. A characteristic example is the Phoenix2014T benchmark dataset, which covers only weather forecasts in German Sign Language. To address this shortage of resources, we introduce a newly constructed collection of 29,653 Greek Sign Language video-translation pairs, based on the official syllabus of Greek Elementary School. Our dataset covers a wide range of subjects. We use this novel dataset to train recent state-of-the-art Transformer-based methods widely used in SLT research. Our results demonstrate the potential of the dataset to advance SLT research by offering a favourable balance between usability and real-world value.
- Andreas Voskou
- Konstantinos P. Panousis
- Harris Partaourides
- Kyriakos Tolias
- Sotirios Chatzis