Consistent Training and Decoding For End-to-end Speech Recognition Using Lattice-free MMI (2112.02498v2)

Published 5 Dec 2021 in cs.AI and cs.CL

Abstract: Recently, End-to-End (E2E) frameworks have achieved remarkable results on various Automatic Speech Recognition (ASR) tasks. However, Lattice-Free Maximum Mutual Information (LF-MMI), one of the discriminative training criteria that shows superior performance in hybrid ASR systems, is rarely adopted in E2E ASR frameworks. In this work, we propose a novel approach to integrate the LF-MMI criterion into E2E ASR frameworks in both the training and decoding stages. The proposed approach shows its effectiveness on two of the most widely used E2E frameworks, Attention-Based Encoder-Decoders (AEDs) and Neural Transducers (NTs). Experiments suggest that introducing the LF-MMI criterion consistently leads to significant performance improvements across various datasets and different E2E ASR frameworks. The best of our models achieves a competitive CER of 4.1% / 4.4% on the Aishell-1 dev/test sets; we also achieve significant error reductions on the Aishell-2 and LibriSpeech datasets over strong baselines.
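
As background (this sketch uses the conventional hybrid-ASR notation, not necessarily the paper's own symbols): the MMI criterion maximizes the posterior probability of the reference transcription against all competing hypotheses. With X_u the acoustic features of utterance u, L_u its reference transcription, kappa an acoustic scaling factor, and P(.) a language-model prior, the objective is

\[
\mathcal{F}_{\mathrm{MMI}}(\theta)
  = \sum_{u} \log
    \frac{ p_{\theta}(X_u \mid L_u)^{\kappa} \, P(L_u) }
         { \sum_{L'} p_{\theta}(X_u \mid L')^{\kappa} \, P(L') }
\]

In the lattice-free variant, the denominator sum over competing hypotheses L' is computed exactly with the forward-backward algorithm over a phone-level denominator graph rather than approximated with word lattices; the paper's contribution is bringing this criterion into the training and decoding of AED and NT models.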

Authors (7)
  1. Jinchuan Tian (33 papers)
  2. Jianwei Yu (64 papers)
  3. Chao Weng (61 papers)
  4. Shi-Xiong Zhang (48 papers)
  5. Dan Su (101 papers)
  6. Dong Yu (329 papers)
  7. Yuexian Zou (119 papers)
Citations (12)
