Exploring Lexicon-Free Modeling Units for End-to-End Korean and Korean-English Code-Switching Speech Recognition (1910.11590v1)

Published 25 Oct 2019 in cs.SD and eess.AS

Abstract: As character-based end-to-end automatic speech recognition (ASR) models evolve, the choice of acoustic modeling units becomes important. Since Korean is a fairly phonetic language with a unique writing system, the Korean alphabet, it is worth investigating modeling units for an end-to-end Korean ASR task. In this work, we introduce lexicon-free modeling units for Korean and explore them using a hybrid CTC/Attention-based encoder-decoder model. Five lexicon-free units are investigated: syllable-based Korean characters (with English characters for the code-switching task), Korean Jamo characters (with English characters), sub-words on syllable-based characters (with English sub-words), sub-words on Jamo characters (with English sub-words), and finally byte units, which are universal across languages. Experiments are conducted on Zeroth-Korean (51.6 hours) for the Korean ASR task and on a Medical Record corpus (2530 hours) for the Korean-English code-switching ASR task. Sequence-to-sequence learning with sub-words based on Korean syllables (and sub-words in English) performs best on both tasks, without a lexicon or extra language model integration.
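
To make the unit inventories concrete, here is a minimal Python sketch (not from the paper) showing how one Korean word decomposes into syllable, Jamo, and byte units. It assumes only the standard Unicode Hangul syllable formula (code point = 0xAC00 + (initial*21 + medial)*28 + final); the paper's sub-word units would then be learned on top of the syllable or Jamo sequences, e.g. with byte-pair encoding.

```python
# Illustrative sketch, not from the paper: three of the five lexicon-free
# unit types for one Korean word, via standard Unicode Hangul decomposition.

INITIALS = [chr(0x1100 + i) for i in range(19)]         # choseong (leading consonants)
MEDIALS = [chr(0x1161 + i) for i in range(21)]          # jungseong (vowels)
FINALS = [""] + [chr(0x11A8 + i) for i in range(27)]    # jongseong (optional trailing consonants)

def to_jamo(text: str) -> list[str]:
    """Decompose precomposed Hangul syllables into Jamo characters."""
    units = []
    for ch in text:
        code = ord(ch)
        if 0xAC00 <= code <= 0xD7A3:                    # Hangul syllable block
            idx = code - 0xAC00
            initial, rest = divmod(idx, 21 * 28)
            medial, final = divmod(rest, 28)
            units.append(INITIALS[initial])
            units.append(MEDIALS[medial])
            if final:                                   # index 0 means no final consonant
                units.append(FINALS[final])
        else:
            units.append(ch)                            # pass through, e.g. English characters
    return units

word = "한글"
print(list(word))                   # syllable units: ['한', '글']
print(to_jamo(word))                # Jamo units: ['ᄒ', 'ᅡ', 'ᆫ', 'ᄀ', 'ᅳ', 'ᆯ']
print(list(word.encode("utf-8")))   # byte units: 6 bytes, 3 per syllable in UTF-8
```

The trade-off the paper studies is visible even here: syllable units give short sequences over a large inventory, Jamo and byte units give longer sequences over small inventories, and byte units handle Korean and English uniformly.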

Authors (4)
  1. Jisung Wang (3 papers)
  2. Jihwan Kim (25 papers)
  3. Sangki Kim (1 paper)
  4. Yeha Lee (3 papers)
Citations (5)