
Improving N-gram Language Models with Pre-trained Deep Transformer (1911.10235v1)

Published 22 Nov 2019 in cs.CL and cs.LG

Abstract: Although n-gram LLMs (LMs) have been outperformed by state-of-the-art neural LMs, they are still widely used in speech recognition due to their high inference efficiency. In this paper, we demonstrate that n-gram LMs can be improved by neural LMs through a text-generation-based data augmentation method. In contrast to previous approaches, we employ large-scale general-domain pre-training followed by an in-domain fine-tuning strategy to construct deep Transformer-based neural LMs. A large amount of in-domain text is generated with the well-trained deep Transformer to construct new n-gram LMs, which are then interpolated with the baseline n-gram systems. Empirical studies on different speech recognition tasks show that the proposed approach can effectively improve recognition accuracy. In particular, our proposed approach brings a significant relative word error rate reduction of up to 6.0% for domains with limited in-domain data.
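The final step the abstract describes, interpolating an n-gram LM built from Transformer-generated text with the baseline n-gram LM, can be illustrated with a minimal sketch. This is not the authors' implementation (which would use a toolkit such as SRILM on real corpora); it is a toy bigram LM with maximum-likelihood estimates, where the corpora and the interpolation weight `lam` are hypothetical placeholders.

```python
from collections import Counter

def bigram_lm(sentences):
    """Build a maximum-likelihood bigram LM from tokenized sentences."""
    bigrams = Counter()
    unigrams = Counter()
    for toks in sentences:
        toks = ["<s>"] + toks + ["</s>"]
        unigrams.update(toks[:-1])
        bigrams.update(zip(toks[:-1], toks[1:]))

    def prob(w_prev, w):
        # P(w | w_prev) = count(w_prev, w) / count(w_prev)
        if unigrams[w_prev] == 0:
            return 0.0
        return bigrams[(w_prev, w)] / unigrams[w_prev]

    return prob

def interpolate(p_base, p_aug, lam):
    """Linear interpolation of two LMs: lam * baseline + (1 - lam) * augmented."""
    return lambda w_prev, w: lam * p_base(w_prev, w) + (1 - lam) * p_aug(w_prev, w)

# Hypothetical toy data: a small real in-domain corpus and text that would,
# in the paper's setup, come from the fine-tuned deep Transformer.
real = [["play", "some", "music"]]
generated = [["play", "some", "jazz"], ["play", "some", "music"]]

p_base = bigram_lm(real)       # baseline n-gram LM
p_aug = bigram_lm(generated)   # n-gram LM over generated text
p = interpolate(p_base, p_aug, lam=0.5)
```

Here the generated corpus lends probability mass to an unseen continuation ("jazz") that the baseline alone assigns zero, which is the mechanism by which the augmentation helps low-resource domains.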

Authors (7)
  1. Yiren Wang (15 papers)
  2. Hongzhao Huang (4 papers)
  3. Zhe Liu (234 papers)
  4. Yutong Pang (7 papers)
  5. Yongqiang Wang (92 papers)
  6. ChengXiang Zhai (64 papers)
  7. Fuchun Peng (18 papers)
Citations (8)