
Improved Self-Supervised Multilingual Speech Representation Learning Combined with Auxiliary Language Information (2212.03476v1)

Published 7 Dec 2022 in eess.AS, cs.CL, and cs.SD

Abstract: Multilingual end-to-end models have shown great improvement over monolingual systems. With the development of pre-training methods on speech, self-supervised multilingual speech representation learning such as XLSR has shown success in improving the performance of multilingual automatic speech recognition (ASR). However, as in supervised learning, multilingual pre-training may also suffer from language interference, which further affects the application of multilingual systems. In this paper, we introduce several techniques for improving self-supervised multilingual pre-training by leveraging auxiliary language information, including language adversarial training, language embedding, and language adaptive training during the pre-training stage. We conduct experiments on a multilingual ASR task consisting of 16 languages. Our experimental results demonstrate a 14.3% relative gain over the standard XLSR model and a 19.8% relative gain over the multilingual model without pre-training.
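The abstract names the techniques but does not detail their architectures. As a rough illustration only, language adversarial training is commonly implemented with a gradient reversal layer that pushes the encoder toward language-invariant features, while a language embedding injects explicit language identity into the encoder input. The PyTorch sketch below shows these two ideas under those common conventions; the class names (LanguageAdversarialHead, LanguageEmbedding), shapes, and hyperparameters are hypothetical and are not taken from the paper.

```python
import torch
import torch.nn as nn


class GradientReversal(torch.autograd.Function):
    """Identity on the forward pass; negates (and scales) gradients on backward."""

    @staticmethod
    def forward(ctx, x, lambd):
        ctx.lambd = lambd
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        # Reverse the gradient flowing into the encoder; no grad w.r.t. lambd.
        return -ctx.lambd * grad_output, None


class LanguageAdversarialHead(nn.Module):
    """Hypothetical adversarial head: predicts the language ID from encoder
    features through gradient reversal, discouraging language-specific features."""

    def __init__(self, feat_dim: int, num_languages: int, lambd: float = 1.0):
        super().__init__()
        self.lambd = lambd
        self.classifier = nn.Linear(feat_dim, num_languages)

    def forward(self, features: torch.Tensor) -> torch.Tensor:
        # features: (batch, time, feat_dim); mean-pool over time before classifying.
        pooled = features.mean(dim=1)
        reversed_feats = GradientReversal.apply(pooled, self.lambd)
        return self.classifier(reversed_feats)  # logits: (batch, num_languages)


class LanguageEmbedding(nn.Module):
    """Hypothetical language embedding: adds a learned per-language vector to
    the encoder input, giving the model explicit language information."""

    def __init__(self, feat_dim: int, num_languages: int):
        super().__init__()
        self.embed = nn.Embedding(num_languages, feat_dim)

    def forward(self, features: torch.Tensor, lang_ids: torch.Tensor) -> torch.Tensor:
        # lang_ids: (batch,); broadcast the embedding across the time axis.
        return features + self.embed(lang_ids).unsqueeze(1)
```

In a setup like this, the cross-entropy loss from the adversarial head would typically be added to the self-supervised pre-training objective with a small weight, so that the encoder trades off contrastive accuracy against language discriminability; the exact weighting and placement in the paper's pipeline are not specified in the abstract.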

Authors (5)
  1. Fenglin Ding
  2. Genshun Wan
  3. Pengcheng Li
  4. Jia Pan
  5. Cong Liu
Citations (1)
