ÌròyìnSpeech: A multi-purpose Yorùbá Speech Corpus (2307.16071v2)

Published 29 Jul 2023 in cs.CL, cs.SD, and eess.AS

Abstract: We introduce ÌròyìnSpeech, a new corpus influenced by the desire to increase the amount of high quality, contemporary Yorùbá speech data, which can be used for both Text-to-Speech (TTS) and Automatic Speech Recognition (ASR) tasks. We curated about 23000 text sentences from news and creative writing domains with the open license CC-BY-4.0. To encourage a participatory approach to data creation, we provide 5000 curated sentences to the Mozilla Common Voice platform to crowd-source the recording and validation of Yorùbá speech data. In total, we created about 42 hours of speech data recorded by 80 volunteers in-house, and 6 hours of validated recordings on Mozilla Common Voice platform. Our TTS evaluation suggests that a high-fidelity, general domain, single-speaker Yorùbá voice is possible with as little as 5 hours of speech. Similarly, for ASR we obtained a baseline word error rate (WER) of 23.8.

Authors (5)
  1. Kola Tubosun (2 papers)
  2. Anuoluwapo Aremu (16 papers)
  3. Iroro Orife (20 papers)
  4. David Ifeoluwa Adelani (59 papers)
  5. Tolulope Ogunremi (5 papers)
Citations (3)
