
Self-Supervised Representations Improve End-to-End Speech Translation (2006.12124v2)

Published 22 Jun 2020 in eess.AS, cs.CL, and cs.SD

Abstract: End-to-end speech-to-text translation can provide a simpler and smaller system but faces the challenge of data scarcity. Pre-training methods can leverage unlabeled data and have been shown to be effective in data-scarce settings. In this work, we explore whether self-supervised pre-trained speech representations can benefit the speech translation task in both high- and low-resource settings, whether they transfer well to other languages, and whether they can be effectively combined with other common methods that help improve low-resource end-to-end speech translation, such as using a pre-trained high-resource speech recognition system. We demonstrate that self-supervised pre-trained features consistently improve translation performance, and that cross-lingual transfer allows the approach to extend to a variety of languages with little or no tuning.
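The core idea the abstract describes is replacing conventional filterbank inputs with features from a self-supervised pre-trained speech model, which are then fed to an end-to-end translation model. A minimal sketch of that setup is below; it uses torchaudio's wav2vec 2.0 bundle as a stand-in for the paper's self-supervised features, and the `SpeechTranslator` module, its layer sizes, and the dummy inputs are illustrative assumptions, not the authors' implementation.

```python
# Sketch: frozen self-supervised speech features feeding a small
# Transformer speech-translation model. torchaudio's wav2vec 2.0
# bundle stands in for the paper's self-supervised representations.
import torch
import torch.nn as nn
import torchaudio

bundle = torchaudio.pipelines.WAV2VEC2_BASE       # self-supervised pre-trained model
feature_extractor = bundle.get_model().eval()     # kept frozen; only the ST model trains

class SpeechTranslator(nn.Module):
    """Illustrative encoder-decoder consuming pre-extracted speech features."""
    def __init__(self, feat_dim=768, vocab_size=10000, d_model=256):
        super().__init__()
        self.proj = nn.Linear(feat_dim, d_model)  # map SSL features to model dim
        self.transformer = nn.Transformer(
            d_model=d_model, nhead=4,
            num_encoder_layers=3, num_decoder_layers=3,
            batch_first=True,
        )
        self.tgt_embed = nn.Embedding(vocab_size, d_model)
        self.out = nn.Linear(d_model, vocab_size)

    def forward(self, feats, tgt_tokens):
        src = self.proj(feats)
        tgt = self.tgt_embed(tgt_tokens)
        mask = self.transformer.generate_square_subsequent_mask(tgt.size(1))
        dec = self.transformer(src, tgt, tgt_mask=mask)
        return self.out(dec)

# Usage: extract frozen features from raw audio, then run the translator.
waveform = torch.randn(1, 16000)                  # 1 s of 16 kHz audio (dummy)
with torch.no_grad():
    feats, _ = feature_extractor.extract_features(waveform)
logits = SpeechTranslator()(feats[-1], torch.zeros(1, 5, dtype=torch.long))
```

Freezing the feature extractor mirrors the data-scarcity motivation: the self-supervised model is trained once on unlabeled audio, while only the comparatively small translation model needs scarce parallel speech-translation data.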

Authors (4)
  1. Anne Wu (11 papers)
  2. Changhan Wang (46 papers)
  3. Juan Pino (51 papers)
  4. Jiatao Gu (84 papers)
Citations (39)
