Reducing Exposure Bias in Training Recurrent Neural Network Transducers (2108.10803v1)

Published 24 Aug 2021 in cs.CL, cs.AI, cs.SD, and eess.AS

Abstract: When recurrent neural network transducers (RNNTs) are trained using the typical maximum likelihood criterion, the prediction network is trained only on ground truth label sequences. This leads to a mismatch during inference, known as exposure bias, when the model must deal with label sequences containing errors. In this paper we investigate approaches to reducing exposure bias in training to improve the generalization of RNNT models for automatic speech recognition (ASR). A label-preserving input perturbation to the prediction network is introduced. The input token sequences are perturbed using SwitchOut and scheduled sampling based on an additional token language model. Experiments conducted on the 300-hour Switchboard dataset demonstrate their effectiveness. By reducing the exposure bias, we show that we can further improve the accuracy of a high-performance RNNT ASR model and obtain state-of-the-art results on the 300-hour Switchboard dataset.
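
The two perturbation schemes named in the abstract are straightforward to sketch. Below is a minimal, illustrative Python version of both: a SwitchOut-style token corruption and LM-driven scheduled sampling. The function names, the temperature parameter `tau`, the uniform replacement distribution, and the `sample_next_token` callable are assumptions made for illustration, not the paper's exact formulation.

```python
import math
import random

def switchout(tokens, vocab_size, tau=1.0):
    """SwitchOut-style perturbation (illustrative sketch): corrupt a few
    positions of the prediction-network input while the RNNT loss still
    targets the original ground-truth sequence (label-preserving)."""
    seq_len = len(tokens)
    # Sample how many positions to corrupt; p(n) ~ exp(-n / tau)
    # concentrates probability mass on small edit counts.
    weights = [math.exp(-n / tau) for n in range(seq_len + 1)]
    n_edits = random.choices(range(seq_len + 1), weights=weights)[0]
    perturbed = list(tokens)
    for pos in random.sample(range(seq_len), n_edits):
        # Replace with a uniformly random vocabulary entry (an assumption;
        # other replacement distributions are possible).
        perturbed[pos] = random.randrange(vocab_size)
    return perturbed

def scheduled_sample(tokens, sample_next_token, p=0.1):
    """Scheduled-sampling-style perturbation (illustrative sketch): with
    probability p, feed a token drawn from an external token language
    model instead of the ground-truth token. `sample_next_token` is a
    hypothetical callable mapping a prefix to a sampled next token."""
    prefix = []
    for tok in tokens:
        prefix.append(sample_next_token(prefix) if random.random() < p else tok)
    return prefix
```

In training, the perturbed sequence would be fed to the prediction network while the transducer loss still scores the unperturbed reference; that decoupling is what makes the perturbation label-preserving.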

Authors (5)
  1. Xiaodong Cui (55 papers)
  2. Brian Kingsbury (54 papers)
  3. George Saon (39 papers)
  4. David Haws (16 papers)
  5. Zoltan Tuske (14 papers)
Citations (5)
