Multilingual Approach to Joint Speech and Accent Recognition with DNN-HMM Framework (2010.11483v2)

Published 22 Oct 2020 in eess.AS and cs.SD

Abstract: Humans can recognize speech and the speaker's accent simultaneously. However, present state-of-the-art ASR systems can rarely do that. In this paper, we propose a multilingual approach to recognizing English speech and the accent the speaker conveys, using the DNN-HMM framework. Specifically, we treat different accents of English as different languages. We then merge them and train a multilingual ASR system. During decoding, we conduct two experiments. One is a monolingual ASR-based decoding, with the accent information embedded at the phone level, realizing word-based accent recognition (AR); the other is a multilingual ASR-based decoding, realizing an approximated utterance-based AR. Experimental results on an 8-accent English speech recognition task show that both methods yield WERs close to those of conventional ASR systems that completely ignore accent, as well as the desired AR accuracy. In addition, we conduct extensive analysis of the proposed method, covering transfer learning with out-of-domain data exploitation, cross-accent recognition confusion, and the characteristics of accented words.
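
To make the idea of "accent information embedded at the phone level" concrete, here is a minimal sketch in Python. It is not the authors' implementation: the suffix convention (`_US`, `_UK`), the helper names, and the majority-vote rule for turning word-level accent tags into an utterance-level label are all assumptions made for illustration. The sketch only shows how a shared English lexicon could be duplicated per accent so that each decoded word carries an accent tag.

```python
from collections import Counter


def tag_lexicon(lexicon, accent):
    """Copy a lexicon so that every word and phone carries an accent suffix.

    The "_<accent>" suffix is a hypothetical naming convention.
    """
    return {
        f"{word}_{accent}": [f"{phone}_{accent}" for phone in phones]
        for word, phones in lexicon.items()
    }


def merge_lexicons(lexicon, accents):
    """Treat each accent as its own 'language' and merge the tagged lexicons."""
    merged = {}
    for accent in accents:
        merged.update(tag_lexicon(lexicon, accent))
    return merged


def word_accents(decoded_words):
    """Read the accent tag off each decoded word (word-based AR)."""
    return [word.rsplit("_", 1)[1] for word in decoded_words]


def utterance_accent(decoded_words):
    """Assumed rule: take the most frequent word-level accent tag."""
    return Counter(word_accents(decoded_words)).most_common(1)[0][0]


if __name__ == "__main__":
    base = {"hello": ["HH", "AH", "L", "OW"]}
    merged = merge_lexicons(base, ["US", "UK"])
    print(merged["hello_UK"])
    print(utterance_accent(["hello_US", "world_US", "again_UK"]))
```

In this sketch the recognized word sequence is recovered by stripping the suffixes, while the suffixes themselves provide the accent decision, which is one way both outputs could come from a single decoding pass.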

Authors (6)
  1. Yizhou Peng (14 papers)
  2. Jicheng Zhang (30 papers)
  3. Haobo Zhang (31 papers)
  4. Haihua Xu (23 papers)
  5. Hao Huang (155 papers)
  6. Eng Siong Chng (112 papers)
Citations (2)