
Dual-Pipeline with Low-Rank Adaptation for New Language Integration in Multilingual ASR (2406.07842v1)

Published 12 Jun 2024 in eess.AS and cs.CL

Abstract: This paper addresses challenges in integrating new languages into a pre-trained multilingual automatic speech recognition (mASR) system, particularly in scenarios where training data for existing languages is limited or unavailable. The proposed method employs a dual-pipeline with low-rank adaptation (LoRA). It maintains two data flow pipelines-one for existing languages and another for new languages. The primary pipeline follows the standard flow through the pre-trained parameters of mASR, while the secondary pipeline additionally utilizes language-specific parameters represented by LoRA and a separate output decoder module. Importantly, the proposed approach minimizes the performance degradation of existing languages and enables a language-agnostic operation mode, facilitated by a decoder selection strategy. We validate the effectiveness of the proposed method by extending the pre-trained Whisper model to 19 new languages from the FLEURS dataset

Authors (8)
  1. Yerbolat Khassanov (19 papers)
  2. Zhipeng Chen (46 papers)
  3. Tianfeng Chen (3 papers)
  4. Tze Yuang Chong (3 papers)
  5. Wei Li (1122 papers)
  6. Jun Zhang (1008 papers)
  7. Lu Lu (189 papers)
  8. Yuxuan Wang (239 papers)

