Phoneme Level Language Models for Sequence Based Low Resource ASR (1902.07613v1)

Published 20 Feb 2019 in cs.CL

Abstract: Building multilingual and crosslingual models helps bring different languages together in a language-universal space, allowing models to share parameters and transfer knowledge across languages and enabling faster and better adaptation to a new language. These approaches are particularly useful for low-resource languages. In this paper, we propose a phoneme-level language model (LM) that can be used multilingually and for crosslingual adaptation to a target language. We show that our model performs almost as well as the monolingual models while using six times fewer parameters, and adapts better to languages not seen during training in a low-resource scenario. We show that these phoneme-level LMs can be used to decode the outputs of sequence-based Connectionist Temporal Classification (CTC) acoustic models, obtaining word error rates comparable to Weighted Finite-State Transducer (WFST) based decoding on Babel languages. We also show that these phoneme-level LMs outperform WFST decoding in low-resource conditions such as adaptation to a new language and domain mismatch between training and test data.
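The decoding setup the abstract describes (scoring sequence-based CTC acoustic-model outputs with a phoneme-level LM rather than a WFST) can be sketched as a CTC prefix beam search with shallow fusion of LM probabilities at each phoneme emission. The snippet below is a minimal illustration under stated assumptions, not the paper's actual decoder: the three-phoneme inventory, the uniform `phoneme_lm_prob` stand-in, and the `lm_weight` fusion exponent are all hypothetical.

```python
from collections import defaultdict

# Hypothetical three-phoneme inventory for illustration; the paper's models
# cover full Babel-language phoneme sets.
PHONEMES = ["a", "b", "k"]

def phoneme_lm_prob(prefix, phoneme):
    """Stand-in for a trained phoneme-level LM giving P(phoneme | prefix).
    Uniform here; the paper trains multilingual phoneme-level LMs instead."""
    return 1.0 / len(PHONEMES)

def ctc_prefix_beam_search(frames, beam_width=4, lm_weight=0.5):
    """Decode CTC frame posteriors with shallow fusion of a phoneme-level LM.

    frames: T rows of probabilities over [blank] + PHONEMES (blank at index 0).
    Returns the most probable collapsed phoneme sequence as a tuple.
    """
    # Each prefix tracks (prob of paths ending in blank, ending in non-blank).
    beams = {(): (1.0, 0.0)}
    for frame in frames:
        next_beams = defaultdict(lambda: [0.0, 0.0])
        for prefix, (p_b, p_nb) in beams.items():
            # Emit blank: prefix unchanged, now ends in blank.
            next_beams[prefix][0] += (p_b + p_nb) * frame[0]
            # Repeat the last phoneme: collapses under CTC, prefix unchanged.
            if prefix:
                last = PHONEMES.index(prefix[-1]) + 1
                next_beams[prefix][1] += p_nb * frame[last]
            # Extend with a new phoneme, weighted by the phoneme LM.
            for i, ph in enumerate(PHONEMES, start=1):
                lm = phoneme_lm_prob(prefix, ph) ** lm_weight
                ext = prefix + (ph,)
                if prefix and ph == prefix[-1]:
                    # Only blank-ending mass starts a genuinely new copy of
                    # the same phoneme; the rest collapsed in the case above.
                    next_beams[ext][1] += p_b * frame[i] * lm
                else:
                    next_beams[ext][1] += (p_b + p_nb) * frame[i] * lm
        # Keep the beam_width best prefixes by total probability.
        top = sorted(next_beams.items(),
                     key=lambda kv: kv[1][0] + kv[1][1], reverse=True)
        beams = {prefix: tuple(p) for prefix, p in top[:beam_width]}
    return max(beams.items(), key=lambda kv: kv[1][0] + kv[1][1])[0]

# Toy usage: two frames of posteriors over [blank, a, b, k].
frames = [[0.1, 0.7, 0.1, 0.1],
          [0.6, 0.1, 0.2, 0.1]]
print(ctc_prefix_beam_search(frames))  # -> ('a',)
```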

Authors (4)
  1. Siddharth Dalmia (36 papers)
  2. Xinjian Li (26 papers)
  3. Alan W Black (83 papers)
  4. Florian Metze (80 papers)
Citations (7)