
Full-Sum Decoding for Hybrid HMM based Speech Recognition using LSTM Language Model (2004.00967v1)

Published 2 Apr 2020 in eess.AS and cs.SD

Abstract: In hybrid HMM based speech recognition, LSTM language models (LMs) have been widely applied and have achieved large improvements. Their theoretical capability of modeling unlimited context suggests that no recombination should be applied in decoding. This motivates reconsidering full summation over the HMM-state sequences instead of the Viterbi approximation in decoding. We explore the potential gain from more accurate probabilities in terms of decision making, and apply full-sum decoding within a modified prefix-tree search framework. The proposed full-sum decoding is evaluated on both the Switchboard and LibriSpeech corpora, using models trained with both cross-entropy (CE) and state-level minimum Bayes risk (sMBR) criteria. Additionally, both MAP and confusion network decoding, as approximated variants of the general Bayes decision rule, are evaluated. Consistent improvements over strong baselines are achieved in almost all cases without extra cost. We also discuss tuning effort, efficiency and some limitations of full-sum decoding.
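
The core contrast in the abstract, Viterbi approximation versus full summation over HMM-state sequences, can be illustrated with a toy forward pass in log space. The sketch below is a minimal illustration only, not the paper's prefix-tree search implementation; the `sequence_score` function, the uniform initial state distribution, and the random scores in the usage example are all hypothetical choices made for this demo.

```python
import numpy as np
from scipy.special import logsumexp

def sequence_score(log_trans, log_emit, use_full_sum):
    """Score an observation sequence under a toy HMM in log space.

    log_trans: (S, S) log transition probabilities, log_trans[i, j] = log p(j | i)
    log_emit:  (T, S) log emission scores per frame and state
    use_full_sum: if True, sum over all state sequences (forward algorithm);
                  if False, keep only the best one (Viterbi approximation).
    """
    T, S = log_emit.shape
    alpha = log_emit[0] + np.log(1.0 / S)  # assume a uniform initial state distribution
    for t in range(1, T):
        # combine the scores of all predecessor states for each current state
        scores = alpha[:, None] + log_trans  # (S, S): predecessor x successor
        if use_full_sum:
            alpha = logsumexp(scores, axis=0) + log_emit[t]
        else:
            alpha = scores.max(axis=0) + log_emit[t]
    return logsumexp(alpha) if use_full_sum else alpha.max()

# Usage with random toy scores: a 3-state HMM scoring 8 frames.
rng = np.random.default_rng(0)
log_trans = np.log(rng.dirichlet(np.ones(3), size=3))
log_emit = np.log(rng.dirichlet(np.ones(3), size=8))
print(sequence_score(log_trans, log_emit, use_full_sum=True))
print(sequence_score(log_trans, log_emit, use_full_sum=False))
```

The full-sum score is always at least the Viterbi score, since it accumulates probability mass from every state alignment rather than only the best one; the paper investigates whether these more accurate word-sequence probabilities change the decisions made during decoding.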

Authors (3)
  1. Wei Zhou (311 papers)
  2. Ralf Schlüter (73 papers)
  3. Hermann Ney (104 papers)
Citations (9)
