
FST: the FAIR Speech Translation System for the IWSLT21 Multilingual Shared Task (2107.06959v2)

Published 14 Jul 2021 in cs.CL, cs.SD, and eess.AS

Abstract: In this paper, we describe our end-to-end multilingual speech translation system submitted to the IWSLT 2021 evaluation campaign for the Multilingual Speech Translation shared task. Our system is built by leveraging transfer learning across modalities, tasks, and languages. First, we leverage general-purpose multilingual modules pretrained with large amounts of unlabelled and labelled data. We further enable knowledge transfer from the text task to the speech task by training the two tasks jointly. Finally, our multilingual model is finetuned on speech-translation-specific data to achieve the best translation results. Experimental results show that our system outperforms the reported systems, including both end-to-end and cascade-based approaches, by a large margin. In some translation directions, our speech translation results evaluated on the public Multilingual TEDx test set are even comparable to those of a strong text-to-text translation system that uses the oracle speech transcripts as input.

Authors (7)
  1. Yun Tang (42 papers)
  2. Hongyu Gong (44 papers)
  3. Xian Li (116 papers)
  4. Changhan Wang (46 papers)
  5. Juan Pino (51 papers)
  6. Holger Schwenk (35 papers)
  7. Naman Goyal (37 papers)
Citations (10)
