
On the End-to-End Solution to Mandarin-English Code-switching Speech Recognition (1811.00241v2)

Published 1 Nov 2018 in cs.CL

Abstract: Code-switching (CS) refers to a linguistic phenomenon where a speaker uses different languages in an utterance or between alternating utterances. In this work, we study end-to-end (E2E) approaches to the Mandarin-English code-switching speech recognition (CSSR) task. We first examine the effectiveness of using data augmentation and byte-pair encoding (BPE) subword units. More importantly, we propose a multitask learning recipe, where a language identification task is explicitly learned in addition to the E2E speech recognition task. Furthermore, we introduce an efficient word vocabulary expansion method for language modeling to alleviate data sparsity issues under the code-switching scenario. Experimental results on the SEAME data, a Mandarin-English CS corpus, demonstrate the effectiveness of the proposed methods.
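
As a rough illustration of the multitask idea described in the abstract, the sketch below derives per-token language labels from a code-switched transcript and combines two losses with a fixed weight. It is not the authors' recipe: the script-based labeling rule, the function names (`token_language`, `lid_targets`, `multitask_loss`), and the `lid_weight` interpolation are all assumptions made for this example.

```python
from typing import List, Tuple


def token_language(token: str) -> str:
    """Label a token by script (an assumed heuristic, not from the paper).

    Tokens containing a CJK Unified Ideograph are tagged 'zh', tokens with
    ASCII letters are tagged 'en', and anything else is tagged 'other'.
    """
    if any("\u4e00" <= ch <= "\u9fff" for ch in token):
        return "zh"
    if any(ch.isascii() and ch.isalpha() for ch in token):
        return "en"
    return "other"


def lid_targets(transcript: str) -> List[Tuple[str, str]]:
    """Pair each whitespace-separated token with its language label."""
    return [(tok, token_language(tok)) for tok in transcript.split()]


def multitask_loss(asr_loss: float, lid_loss: float, lid_weight: float = 0.1) -> float:
    """Interpolate the two task losses; lid_weight is a hypothetical hyperparameter."""
    return (1.0 - lid_weight) * asr_loss + lid_weight * lid_loss


if __name__ == "__main__":
    # A short Mandarin-English code-switched utterance.
    for token, lang in lid_targets("我 想 去 shopping"):
        print(token, lang)
```

In a real system the language labels would feed an auxiliary classifier trained jointly with the recognizer; here the losses are plain floats so the weighting can be inspected in isolation.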

Authors (6)
  1. Zhiping Zeng (6 papers)
  2. Yerbolat Khassanov (19 papers)
  3. Van Tung Pham (13 papers)
  4. Haihua Xu (23 papers)
  5. Eng Siong Chng (112 papers)
  6. Haizhou Li (286 papers)
Citations (91)