Minimum Word Error Rate Training with Language Model Fusion for End-to-End Speech Recognition (2106.02302v1)

Published 4 Jun 2021 in eess.AS, cs.AI, cs.CL, cs.LG, and cs.SD

Abstract: Integrating external language models (LMs) into end-to-end (E2E) models remains a challenging task for domain-adaptive speech recognition. Recently, internal language model estimation (ILME)-based LM fusion has shown significant word error rate (WER) reduction relative to Shallow Fusion by subtracting a weighted internal LM score from an interpolation of E2E model and external LM scores during beam search. However, the optimal LM interpolation weights vary over a wide range across test sets and have to be tuned extensively on well-matched validation sets. In this work, we perform LM fusion in the minimum WER (MWER) training of an E2E model to obviate the need for LM weight tuning during inference. Besides MWER training with Shallow Fusion (MWER-SF), we propose a novel MWER training with ILME (MWER-ILME), in which ILME-based fusion is used to generate N-best hypotheses and their posteriors. An additional gradient is induced when the internal LM is engaged in the MWER-ILME loss computation. During inference, the LM weights pre-determined in MWER training enable robust LM integration on test sets from different domains. In experiments with transformer transducers trained on 30K hours of data, MWER-ILME achieves on average 8.8% and 5.8% relative WER reductions from MWER and MWER-SF training, respectively, on 6 different test sets.
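
The score combination and loss described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the function names, the weight parameters `ext_weight` and `int_weight`, and the mean-subtracted expected-word-error form of the MWER loss are assumptions chosen for illustration. The abstract only states that a weighted internal LM score is subtracted from an interpolation of E2E and external LM scores, and that N-best hypotheses and their posteriors enter the loss.

```python
import math


def ilme_score(log_p_e2e, log_p_ext_lm, log_p_int_lm, ext_weight, int_weight):
    """ILME-style fused score for one hypothesis (hypothetical helper).

    Interpolates the E2E and external LM log-scores, then subtracts a
    weighted internal LM log-score.
    """
    return log_p_e2e + ext_weight * log_p_ext_lm - int_weight * log_p_int_lm


def shallow_fusion_score(log_p_e2e, log_p_ext_lm, ext_weight):
    """Shallow Fusion is the special case with no internal LM term."""
    return ilme_score(log_p_e2e, log_p_ext_lm, 0.0, ext_weight, 0.0)


def mwer_loss(fused_scores, word_errors):
    """Expected word errors over an N-best list (assumed formulation).

    Posteriors are the fused scores renormalized over the N-best list;
    the mean error is subtracted as a baseline.
    """
    top = max(fused_scores)
    exps = [math.exp(s - top) for s in fused_scores]  # numerically stable softmax
    total = sum(exps)
    posteriors = [e / total for e in exps]
    mean_err = sum(word_errors) / len(word_errors)
    return sum(p * (w - mean_err) for p, w in zip(posteriors, word_errors))


# Illustrative numbers only: two hypotheses with made-up log-scores and error counts.
n_best = [
    ilme_score(-10.0, -12.0, -8.0, ext_weight=0.5, int_weight=0.25),
    ilme_score(-11.0, -9.0, -9.0, ext_weight=0.5, int_weight=0.25),
]
loss = mwer_loss(n_best, word_errors=[1, 3])
```

Because the fused scores define the posteriors inside the loss, the weights fixed during training are the same ones applied at inference, which is the motivation the abstract gives for avoiding per-test-set tuning.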

Authors (9)
  1. Zhong Meng (53 papers)
  2. Yu Wu (196 papers)
  3. Naoyuki Kanda (61 papers)
  4. Liang Lu (42 papers)
  5. Xie Chen (165 papers)
  6. Guoli Ye (15 papers)
  7. Eric Sun (14 papers)
  8. Jinyu Li (164 papers)
  9. Yifan Gong (82 papers)
Citations (21)