L2RS: A Learning-to-Rescore Mechanism for Automatic Speech Recognition (1910.11496v1)

Published 25 Oct 2019 in cs.CL, cs.SD, and eess.AS

Abstract: Modern Automatic Speech Recognition (ASR) systems primarily rely on scores from an Acoustic Model (AM) and a Language Model (LM) to rescore the N-best lists. With the abundance of recent natural language processing advances, the information utilized by current ASR for evaluating the linguistic and semantic legitimacy of the N-best hypotheses is rather limited. In this paper, we propose a novel Learning-to-Rescore (L2RS) mechanism, which is specialized for utilizing a wide range of textual information from the state-of-the-art NLP models and automatically deciding their weights to rescore the N-best lists for ASR systems. Specifically, we incorporate features including BERT sentence embedding, topic vector, and perplexity scores produced by n-gram LM, topic modeling LM, BERT LM and RNNLM to train a rescoring model. We conduct extensive experiments based on a public dataset, and experimental results show that L2RS outperforms not only traditional rescoring methods but also its deep neural network counterparts by a substantial improvement of 20.67% in terms of NDCG@10. L2RS paves the way for developing more effective rescoring models for ASR.
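To make the two ideas in the abstract concrete (rescoring an N-best list with weighted features, and scoring the resulting ranking with NDCG@10), here is a minimal Python sketch. It is not the authors' implementation: the feature names (`am`, `lm`), the hand-set weights, and the integer relevance labels are all assumptions for illustration. In L2RS the weights are learned by a ranking model, and the abstract does not state how relevance labels are derived. The DCG formula used is the common exponential-gain variant, which may differ from the one used in the paper.

```python
import math

def rescore(nbest, weights):
    """Sort hypotheses by a weighted sum of their features, best first.

    `nbest` is a list of dicts with a "features" mapping (hypothetical layout).
    `weights` maps feature names to weights; here they are set by hand,
    whereas a learning-to-rank model would fit them from data.
    """
    def score(hyp):
        return sum(weights[name] * value for name, value in hyp["features"].items())
    return sorted(nbest, key=score, reverse=True)

def dcg_at_k(relevances, k):
    """Discounted cumulative gain over the top-k positions (exponential gain)."""
    return sum((2 ** rel - 1) / math.log2(pos + 2)
               for pos, rel in enumerate(relevances[:k]))

def ndcg_at_k(relevances, k=10):
    """DCG of the given ranking divided by the DCG of the ideal ranking."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0

# Toy N-best list: log-scores from an acoustic model and a language model,
# plus an assumed relevance label (higher means closer to the reference).
nbest = [
    {"text": "recognize speech", "features": {"am": -10.0, "lm": -5.0}, "rel": 3},
    {"text": "wreck a nice beach", "features": {"am": -9.0, "lm": -8.0}, "rel": 1},
]

ranked = rescore(nbest, {"am": 1.0, "lm": 1.0})
print([h["text"] for h in ranked], ndcg_at_k([h["rel"] for h in ranked]))
```

With acoustic scores alone the second hypothesis would rank first; adding the language-model feature reorders the list, which is the kind of effect a richer feature set with learned weights is meant to produce at scale.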


Authors (7)

Yuanfeng Song
Di Jiang
Xuefang Zhao
Qian Xu
Raymond Chi-Wing Wong
Lixin Fan
Qiang Yang
