Scalable Multi Corpora Neural Language Models for ASR (1907.01677v1)
Published 2 Jul 2019 in cs.CL and cs.LG
Abstract: Neural language models (NLM) have been shown to outperform conventional n-gram language models by a substantial margin in Automatic Speech Recognition (ASR) and other tasks. There are, however, a number of challenges that need to be addressed for an NLM to be used in a practical large-scale ASR system. In this paper, we present solutions to some of the challenges, including training NLM from heterogeneous corpora, limiting latency impact and handling personalized bias in the second-pass rescorer. Overall, we show that we can achieve a 6.2% relative WER reduction using a neural LM in a second-pass n-best rescoring framework with a minimal increase in latency.
- Anirudh Raju
- Denis Filimonov
- Gautam Tiwari
- Guitang Lan
- Ariya Rastrow
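To make the second-pass n-best rescoring framework from the abstract concrete, here is a minimal Python sketch. It assumes the common setup of interpolating each hypothesis's first-pass decoder score with a neural LM log-probability and re-ranking; the names `rescore_nbest`, `lm_weight`, and `toy_nlm_logprob` are hypothetical placeholders, not details from the paper, and the toy scorer stands in for a trained NLM.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Hypothesis:
    text: str
    first_pass_score: float  # log-domain score from the first-pass decoder


def rescore_nbest(
    hyps: List[Hypothesis],
    nlm_logprob: Callable[[str], float],
    lm_weight: float = 0.5,  # hypothetical interpolation weight, tuned on dev data
) -> Hypothesis:
    """Return the hypothesis with the highest interpolated score.

    The combined score is the first-pass score plus a weighted neural LM
    log-probability, a standard interpolation scheme for n-best rescoring.
    """
    def combined(h: Hypothesis) -> float:
        return h.first_pass_score + lm_weight * nlm_logprob(h.text)

    return max(hyps, key=combined)


def toy_nlm_logprob(text: str) -> float:
    """Toy stand-in for a trained NLM: assigns higher log-prob to shorter texts."""
    return -0.5 * len(text.split())


if __name__ == "__main__":
    nbest = [
        Hypothesis("play the beetles", first_pass_score=-12.3),
        Hypothesis("play the beatles", first_pass_score=-12.5),
    ]
    best = rescore_nbest(nbest, toy_nlm_logprob)
    print(best.text)
```

In a real system the NLM scores would come from a trained neural network, and because rescoring only touches the short n-best list rather than the full decoding lattice, its latency cost stays small, which is the property the abstract emphasizes.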