
Joint Modeling of Accents and Acoustics for Multi-Accent Speech Recognition (1802.02656v1)

Published 7 Feb 2018 in cs.CL, cs.SD, and eess.AS

Abstract: The performance of automatic speech recognition systems degrades with increasing mismatch between the training and testing scenarios. Differences in speaker accents are a significant source of such mismatch. The traditional approach to deal with multiple accents involves pooling data from several accents during training and building a single model in multi-task fashion, where tasks correspond to individual accents. In this paper, we explore an alternate model where we jointly learn an accent classifier and a multi-task acoustic model. Experiments on the American English Wall Street Journal and British English Cambridge corpora demonstrate that our joint model outperforms the strong multi-task acoustic model baseline. We obtain a 5.94% relative improvement in word error rate on British English, and 9.47% relative improvement on American English. This illustrates that jointly modeling with accent information improves acoustic model performance.
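As a rough illustration of the idea in the abstract, the sketch below shows one way a joint model can combine an accent classifier with accent-specific acoustic heads over a shared encoder: the accent posterior weights the per-accent senone distributions. This is a minimal toy example, not the authors' architecture; all dimensions, weights, and the marginalization scheme are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dimensions (hypothetical; the paper's actual setup differs)
feat_dim, hidden, n_accents, n_senones = 40, 32, 2, 10

# Shared encoder weights
W_enc = rng.standard_normal((feat_dim, hidden)) * 0.1
# Accent-classifier head
W_acc = rng.standard_normal((hidden, n_accents)) * 0.1
# One acoustic-model head per accent (the multi-task branches)
W_am = rng.standard_normal((n_accents, hidden, n_senones)) * 0.1

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def forward(frames):
    """Joint forward pass: the accent posterior weights the
    accent-specific acoustic heads, so accent identity and
    acoustics are modeled together rather than separately."""
    h = np.tanh(frames @ W_enc)                    # shared representation, (T, hidden)
    p_accent = softmax(h @ W_acc)                  # accent posterior, (T, n_accents)
    # Per-accent senone scores: (T, n_accents, n_senones)
    logits = np.einsum('th,ahs->tas', h, W_am)
    # Marginalize over accents using the classifier posterior
    p_senone = np.einsum('ta,tas->ts', p_accent, softmax(logits))
    return p_accent, p_senone

frames = rng.standard_normal((5, feat_dim))        # 5 toy acoustic frames
p_accent, p_senone = forward(frames)
```

In a real system both heads would be trained jointly, e.g. with a weighted sum of the accent-classification loss and the acoustic-model loss, which is what lets accent information shape the shared representation.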

Authors (6)
  1. Xuesong Yang (18 papers)
  2. Kartik Audhkhasi (22 papers)
  3. Andrew Rosenberg (32 papers)
  4. Samuel Thomas (42 papers)
  5. Bhuvana Ramabhadran (47 papers)
  6. Mark Hasegawa-Johnson (62 papers)
Citations (69)
