Scalable Multi Corpora Neural Language Models for ASR (1907.01677v1)
Published 2 Jul 2019 in cs.CL and cs.LG
Abstract: Neural language models (NLM) have been shown to outperform conventional n-gram language models by a substantial margin in Automatic Speech Recognition (ASR) and other tasks. There are, however, a number of challenges that need to be addressed for an NLM to be used in a practical large-scale ASR system. In this paper, we present solutions to some of the challenges, including training NLM from heterogeneous corpora, limiting latency impact and handling personalized bias in the second-pass rescorer. Overall, we show that we can achieve a 6.2% relative WER reduction using a neural LM in a second-pass n-best rescoring framework with a minimal increase in latency.
- Anirudh Raju
- Denis Filimonov
- Gautam Tiwari
- Guitang Lan
- Ariya Rastrow
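To make the second-pass n-best rescoring framework from the abstract concrete, here is a minimal Python sketch. It assumes the common setup of interpolating each hypothesis's first-pass decoder score with a neural LM log-probability and re-ranking; the names `rescore_nbest`, `lm_weight`, and `toy_nlm_logprob` are hypothetical placeholders, not details from the paper, and the toy scorer stands in for a trained NLM.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Hypothesis:
    text: str
    first_pass_score: float  # log-domain score from the first-pass decoder


def rescore_nbest(
    hyps: List[Hypothesis],
    nlm_logprob: Callable[[str], float],
    lm_weight: float = 0.5,  # hypothetical interpolation weight, tuned on dev data
) -> Hypothesis:
    """Return the hypothesis with the highest interpolated score.

    The combined score is the first-pass score plus a weighted neural LM
    log-probability, a standard interpolation scheme for n-best rescoring.
    """
    def combined(h: Hypothesis) -> float:
        return h.first_pass_score + lm_weight * nlm_logprob(h.text)

    return max(hyps, key=combined)


def toy_nlm_logprob(text: str) -> float:
    """Toy stand-in for a trained NLM: assigns higher log-prob to shorter texts."""
    return -0.5 * len(text.split())


if __name__ == "__main__":
    nbest = [
        Hypothesis("play the beetles", first_pass_score=-12.3),
        Hypothesis("play the beatles", first_pass_score=-12.5),
    ]
    best = rescore_nbest(nbest, toy_nlm_logprob)
    print(best.text)
```

In a real system the NLM scores would come from a trained neural network, and because rescoring only touches the short n-best list rather than the full decoding lattice, its latency cost stays small, which is the property the abstract emphasizes.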