
Internal Language Model Estimation based Language Model Fusion for Cross-Domain Code-Switching Speech Recognition (2207.04176v1)

Published 9 Jul 2022 in eess.AS, cs.CL, and cs.SD

Abstract: Internal Language Model Estimation (ILME) based language model (LM) fusion has been shown to significantly improve recognition results over conventional shallow fusion in both intra-domain and cross-domain speech recognition tasks. In this paper, we apply our ILME method to cross-domain code-switching speech recognition (CSSR). Our interest stems from several aspects. First, we examine how effective ILME-based LM fusion is for both intra-domain and cross-domain CSSR tasks, with and without merging the two code-switching domains. More importantly, we train an end-to-end (E2E) speech recognition model by merging two monolingual data sets and observe the efficacy of the proposed ILME-based LM fusion for CSSR. Experimental results on SEAME, a code-switching data set from Southeast Asia, and another code-switching data set from Mainland China demonstrate the effectiveness of the proposed ILME-based LM fusion method.
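The idea behind ILME-based fusion is that an E2E model implicitly learns an internal LM on its training-domain text; during decoding, an estimate of that internal score is subtracted before adding an external target-domain LM, rather than simply adding the external LM as in shallow fusion. A minimal sketch of the per-hypothesis scoring rule (the function name and weight values are illustrative assumptions, not from the paper):

```python
def ilme_fusion_score(logp_e2e, logp_ilm, logp_ext,
                      lam_ilm=0.3, lam_ext=0.5):
    """Score one hypothesis y given acoustics x.

    logp_e2e: log P_E2E(y|x) from the end-to-end ASR model
    logp_ilm: log P_ILM(y), estimated internal-LM score of the E2E model
    logp_ext: log P_LM(y) from the external (target-domain) LM
    Subtracting the internal-LM term discounts source-domain text bias,
    letting the external LM steer cross-domain decoding.
    Interpolation weights lam_ilm and lam_ext are tuned on dev data.
    """
    return logp_e2e - lam_ilm * logp_ilm + lam_ext * logp_ext

# Toy comparison: shallow fusion is the special case lam_ilm = 0.
shallow = ilme_fusion_score(-1.0, -2.0, -0.5, lam_ilm=0.0)  # -1.25
ilme = ilme_fusion_score(-1.0, -2.0, -0.5, lam_ilm=0.3)     # -0.65
```

In practice the internal-LM score is itself only an estimate (e.g. obtained by zeroing out the acoustic context), which is where the specific ILME method of the paper comes in.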

Authors (7)
  1. Yizhou Peng (12 papers)
  2. Yufei Liu (23 papers)
  3. Jicheng Zhang (30 papers)
  4. Haihua Xu (23 papers)
  5. Yi He (78 papers)
  6. Hao Huang (153 papers)
  7. Eng Siong Chng (112 papers)
Citations (9)