
L2RS: A Learning-to-Rescore Mechanism for Automatic Speech Recognition (1910.11496v1)

Published 25 Oct 2019 in cs.CL, cs.SD, and eess.AS

Abstract: Modern Automatic Speech Recognition (ASR) systems primarily rely on scores from an Acoustic Model (AM) and a Language Model (LM) to rescore the N-best lists. With the abundance of recent natural language processing advances, the information utilized by current ASR for evaluating the linguistic and semantic legitimacy of the N-best hypotheses is rather limited. In this paper, we propose a novel Learning-to-Rescore (L2RS) mechanism, which is specialized for utilizing a wide range of textual information from the state-of-the-art NLP models and automatically deciding their weights to rescore the N-best lists for ASR systems. Specifically, we incorporate features including BERT sentence embedding, topic vector, and perplexity scores produced by n-gram LM, topic modeling LM, BERT LM and RNNLM to train a rescoring model. We conduct extensive experiments based on a public dataset, and experimental results show that L2RS outperforms not only traditional rescoring methods but also its deep neural network counterparts by a substantial improvement of 20.67% in terms of NDCG@10. L2RS paves the way for developing more effective rescoring models for ASR.
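
To make the abstract's two central ideas concrete (reordering an N-best list by weighted feature scores, and judging the resulting order with NDCG@10), here is a minimal Python sketch. It is not the authors' model: L2RS learns its weights, whereas the weights below are set by hand. The feature names (`am`, `lm`), the example hypotheses, their scores, and the graded `relevance` labels are all hypothetical. The linear gain in DCG is also an assumption; the abstract does not say which NDCG variant is used.

```python
import math

def dcg_at_k(relevances, k):
    """Discounted cumulative gain over the first k items (linear gain)."""
    return sum(rel / math.log2(rank + 2)
               for rank, rel in enumerate(relevances[:k]))

def ndcg_at_k(relevances, k=10):
    """DCG of the given ordering divided by the DCG of the ideal ordering."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0

def rescore(hypotheses, weights):
    """Sort hypotheses by a weighted sum of their feature scores, best first."""
    def score(hyp):
        return sum(weights[name] * value
                   for name, value in hyp["features"].items())
    return sorted(hypotheses, key=score, reverse=True)

# Hypothetical 3-best list; scores are log-probabilities (higher is better).
n_best = [
    {"text": "recognize speech",
     "features": {"am": -12.0, "lm": -4.0}, "relevance": 2},
    {"text": "wreck a nice beach",
     "features": {"am": -11.5, "lm": -9.0}, "relevance": 0},
    {"text": "recognise peach",
     "features": {"am": -12.5, "lm": -6.0}, "relevance": 1},
]

# Hand-set weights stand in for the weights a trained rescoring model would supply.
weights = {"am": 1.0, "lm": 0.5}

ranked = rescore(n_best, weights)
ndcg = ndcg_at_k([hyp["relevance"] for hyp in ranked], k=10)
print([hyp["text"] for hyp in ranked], ndcg)
```

With these weights the most relevant hypothesis is ranked first even though its acoustic score alone is not the highest, which is the kind of reordering a rescoring model is meant to achieve.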

Authors (7)
  1. Yuanfeng Song (27 papers)
  2. Di Jiang (42 papers)
  3. Xuefang Zhao (4 papers)
  4. Qian Xu (55 papers)
  5. Raymond Chi-Wing Wong (29 papers)
  6. Lixin Fan (77 papers)
  7. Qiang Yang (202 papers)
Citations (17)