Evaluating Self-Supervised Speech Representations for Indigenous American Languages (2310.03639v2)
Abstract: The application of self-supervision to speech representation learning has garnered significant interest in recent years because it scales to large amounts of unlabeled data. However, much of the progress, in both pre-training and downstream evaluation, has remained concentrated in monolingual models that consider only English. Few models consider other languages, and even fewer consider indigenous ones. In our submission to the New Language Track of the ASRU 2023 ML-SUPERB Challenge, we present an ASR corpus for Quechua, an indigenous South American language. We benchmark the efficacy of large SSL models on low-resource ASR for Quechua and 6 other indigenous languages, including Guarani and Bribri. Our results show surprisingly strong performance by state-of-the-art SSL models, demonstrating the potential generalizability of large-scale models to real-world data.
Authors: Chih-Chen Chen, William Chen, Rodolfo Zevallos, John E. Ortega