Context-based out-of-vocabulary word recovery for ASR systems in Indian languages (2206.04305v1)

Published 9 Jun 2022 in eess.AS, cs.CL, and cs.SD

Abstract: Detecting and recovering out-of-vocabulary (OOV) words is always challenging for Automatic Speech Recognition (ASR) systems. Many existing methods model OOV words by modifying the acoustic and language models and integrating context words into them. Training such complex models requires a large amount of data with context words, additional training time, and increased model size. However, post-processing the ASR transcription to recover context-based OOV words has not been explored much. In this work, we propose a post-processing technique to improve the performance of context-based OOV recovery. We create an acoustically boosted language model with a sub-graph built at the phone level from a list of OOV words. We propose two methods to determine a suitable cost function for retrieving OOV words based on the context. The cost function is defined using phonetic and acoustic knowledge to match and recover the correct context words in the decoded output. The effectiveness of the proposed cost function is evaluated at both the word level and the sentence level. The evaluation results show that this approach can recover an average of 50% of context-based OOV words across multiple categories.
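
The abstract does not spell out the cost function, so the following is only a rough sketch of the general idea of phone-level matching against an OOV word list, not the authors' method. It assumes a plain Levenshtein distance over phone sequences, normalized by length, with a hypothetical acceptance threshold `max_cost`. The names `phone_edit_distance`, `recover_oov`, and the toy lexicon are illustrative assumptions.

```python
from typing import Dict, Optional, Sequence, Tuple


def phone_edit_distance(a: Sequence[str], b: Sequence[str]) -> int:
    """Levenshtein distance between two phone sequences (unit costs)."""
    prev = list(range(len(b) + 1))
    for i, pa in enumerate(a, start=1):
        curr = [i]
        for j, pb in enumerate(b, start=1):
            substitute = prev[j - 1] + (pa != pb)
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, substitute))
        prev = curr
    return prev[-1]


def recover_oov(
    decoded_phones: Sequence[str],
    oov_lexicon: Dict[str, Sequence[str]],
    max_cost: float = 0.34,  # hypothetical threshold, not taken from the paper
) -> Tuple[Optional[str], float]:
    """Return the OOV word with the lowest normalized phone cost.

    The word is returned only if its cost is at most max_cost;
    otherwise the first element of the result is None.
    """
    best_word: Optional[str] = None
    best_cost = float("inf")
    for word, phones in oov_lexicon.items():
        dist = phone_edit_distance(decoded_phones, phones)
        cost = dist / max(len(decoded_phones), len(phones), 1)
        if cost < best_cost:
            best_word, best_cost = word, cost
    if best_cost <= max_cost:
        return best_word, best_cost
    return None, best_cost


# Toy OOV list with made-up phone spellings (illustrative only).
OOV_LEXICON = {
    "chennai": ["ch", "e", "n", "n", "ai"],
    "mumbai": ["m", "u", "m", "b", "ai"],
}

match = recover_oov(["ch", "e", "n", "ai"], OOV_LEXICON)
no_match = recover_oov(["k", "a", "t"], OOV_LEXICON)
```

Here a decoded phone sequence with one missing phone is still mapped to the nearest listed word, while a sequence that is far from every entry is rejected. A cost that also uses acoustic knowledge, as the abstract describes, would replace the unit substitution cost with something informed by acoustic confusability.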

Authors (6)
  1. Arun Baby (3 papers)
  2. Saranya Vinnaitherthan (3 papers)
  3. Akhil Kerhalkar (1 paper)
  4. Pranav Jawale (3 papers)
  5. Sharath Adavanne (23 papers)
  6. Nagaraj Adiga (6 papers)
