Attention-Guided Adaptation for Code-Switching Speech Recognition (2312.08856v2)

Published 14 Dec 2023 in eess.AS and cs.SD

Abstract: The prevalence of powerful multilingual models, such as Whisper, has significantly advanced research on speech recognition. However, these models often struggle to handle code-switching, which is essential in multilingual speech recognition. Recent studies have attempted to address this setting by separating the modules for different languages so as to ensure distinct latent representations per language. Other methods consider a switching mechanism based on language identification. In this study, a new attention-guided adaptation is proposed to conduct parameter-efficient learning for bilingual ASR. This method selects the attention heads in a model that most closely express language identities and then guides those heads to attend correctly to their corresponding languages. Experiments on a Mandarin-English code-switching speech corpus show that the proposed approach achieves a 14.2% mixed error rate, surpassing state-of-the-art methods, while training only 5.6% additional parameters over Whisper.
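The abstract only describes the method at a high level. As a rough illustration of what "guiding selected attention heads toward their corresponding languages" could look like in practice, the minimal PyTorch sketch below adds an auxiliary loss that pushes chosen cross-attention heads toward a language-aligned target distribution. The head-selection criterion, the KL-based objective, and all names (attention_guidance_loss, lang_mask, guided_heads, lambda_guide) are assumptions made for illustration, not the authors' implementation.

```python
# Illustrative sketch only: the exact selection criterion and guidance
# objective are not given in the abstract, so the KL-based loss below
# and all variable names are assumptions.
import torch
import torch.nn.functional as F

def attention_guidance_loss(attn, lang_mask, guided_heads):
    """Encourage selected attention heads to attend to frames matching
    the language of the token being decoded.

    attn:         (batch, heads, tgt_len, src_len) cross-attention weights
    lang_mask:    (batch, tgt_len, src_len), 1.0 where the source frame's
                  language matches the target token's language, else 0.0
    guided_heads: indices of heads selected as "language-expressing" heads
    """
    # Normalize the binary mask into a target attention distribution per token.
    target = lang_mask / lang_mask.sum(dim=-1, keepdim=True).clamp_min(1e-8)
    loss = 0.0
    for h in guided_heads:
        # KL divergence between the head's attention and the language-aligned target.
        loss = loss + F.kl_div(attn[:, h].clamp_min(1e-8).log(), target,
                               reduction="batchmean")
    return loss / max(len(guided_heads), 1)

# Hypothetical use in a parameter-efficient fine-tuning step (only adapter
# parameters trained), combined with the usual ASR cross-entropy loss:
# total_loss = asr_ce_loss + lambda_guide * attention_guidance_loss(
#     cross_attn, lang_mask, guided_heads=[3, 7])
```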

Authors (4)
  1. Bobbi Aditya (1 paper)
  2. Mahdin Rohmatillah (2 papers)
  3. Liang-Hsuan Tai (2 papers)
  4. Jen-Tzung Chien (6 papers)
Citations (7)
