
Svarah: Evaluating English ASR Systems on Indian Accents (2305.15760v1)

Published 25 May 2023 in cs.CL, cs.SD, and eess.AS

Abstract: India is the second largest English-speaking country in the world, with a speaker base of roughly 130 million. It is therefore imperative that automatic speech recognition (ASR) systems for English be evaluated on Indian accents. Unfortunately, Indian speakers are very poorly represented in existing English ASR benchmarks such as LibriSpeech, Switchboard, and Speech Accent Archive. In this work, we address this gap by creating Svarah, a benchmark that contains 9.6 hours of transcribed English audio from 117 speakers across 65 geographic locations throughout India, resulting in a diverse range of accents. Svarah comprises both read speech and spontaneous conversational data, covering domains such as history, culture, and tourism to ensure a diverse vocabulary. We evaluate 6 open source ASR models and 2 commercial ASR systems on Svarah and show that there is clear scope for improvement on Indian accents. Svarah as well as all our code will be publicly available.

Authors (9)
  1. Tahir Javed (9 papers)
  2. Sakshi Joshi (4 papers)
  3. Vignesh Nagarajan (2 papers)
  4. Sai Sundaresan (3 papers)
  5. Janki Nawale (3 papers)
  6. Abhigyan Raman (5 papers)
  7. Kaushal Bhogale (6 papers)
  8. Pratyush Kumar (44 papers)
  9. Mitesh M. Khapra (79 papers)
Citations (7)