WCTC-Biasing: Retraining-free Contextual Biasing ASR with Wildcard CTC-based Keyword Spotting and Inter-layer Biasing (2506.01263v1)

Published 2 Jun 2025 in cs.CL, cs.SD, and eess.AS

Abstract: Despite recent advances in end-to-end speech recognition methods, the output tends to be biased to the training data's vocabulary, resulting in inaccurate recognition of proper nouns and other unknown terms. To address this issue, we propose a method to improve recognition accuracy of such rare words in CTC-based models without additional training or text-to-speech systems. Specifically, keyword spotting is performed using acoustic features of intermediate layers during inference, and a bias is applied to the subsequent layers of the acoustic model for detected keywords. For keyword detection, we adopt a wildcard CTC that is both fast and tolerant of ambiguous matches, allowing flexible handling of words that are difficult to match strictly. Since this method does not require retraining of existing models, it can be easily applied to even large-scale models. In experiments on Japanese speech recognition, the proposed method achieved a 29% improvement in the F1 score for unknown words.

Summary

We haven't generated a summary for this paper yet.

Summarize Now

WCTC-Biasing: Retraining-free Contextual Biasing ASR with Wildcard CTC-based Keyword Spotting and Inter-layer Biasing (2506.01263v1)

Summary

Related Papers