
Scalable Multi Corpora Neural Language Models for ASR (1907.01677v1)

Published 2 Jul 2019 in cs.CL and cs.LG

Abstract: Neural language models (NLM) have been shown to outperform conventional n-gram language models by a substantial margin in Automatic Speech Recognition (ASR) and other tasks. There are, however, a number of challenges that need to be addressed for an NLM to be used in a practical large-scale ASR system. In this paper, we present solutions to some of the challenges, including training NLM from heterogeneous corpora, limiting latency impact and handling personalized bias in the second-pass rescorer. Overall, we show that we can achieve a 6.2% relative WER reduction using a neural LM in a second-pass n-best rescoring framework with a minimal increase in latency.
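
As a rough illustration of the second-pass n-best rescoring framework the abstract refers to, the sketch below re-ranks first-pass hypotheses by interpolating the decoder's score with an NLM score. The `Hypothesis` structure, the `nlm_logprob` scorer, and the interpolation weight `lam` are illustrative assumptions, not details taken from the paper.

```python
# Minimal sketch of second-pass n-best rescoring with a neural LM.
# The NLM scorer and interpolation weight are hypothetical placeholders.

from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Hypothesis:
    words: List[str]
    first_pass_score: float  # log-domain score from the first-pass decoder


def rescore_nbest(
    nbest: List[Hypothesis],
    nlm_logprob: Callable[[List[str]], float],  # hypothetical NLM scorer
    lam: float = 0.5,  # assumed interpolation weight, typically tuned on dev data
) -> Hypothesis:
    """Return the hypothesis with the best interpolated score."""
    def combined(h: Hypothesis) -> float:
        # Log-linear interpolation of first-pass and NLM scores.
        return (1.0 - lam) * h.first_pass_score + lam * nlm_logprob(h.words)

    return max(nbest, key=combined)
```

Because only the n-best list (rather than the full lattice) is rescored, the NLM forward passes can be batched over a small, fixed number of hypotheses, which is one way such a framework keeps the latency impact small.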

Authors (5)
  1. Anirudh Raju
  2. Denis Filimonov
  3. Gautam Tiwari
  4. Guitang Lan
  5. Ariya Rastrow
Citations (26)