End-to-End Speech Translation with Knowledge Distillation (1904.08075v1)

Published 17 Apr 2019 in cs.CL

Abstract: End-to-end speech translation (ST), which directly translates source-language speech into target-language text, has attracted intensive attention in recent years. Compared to conventional pipeline systems, end-to-end ST models offer lower latency, smaller model size, and less error propagation. However, combining speech recognition and text translation in one model is more difficult than either task alone. In this paper, we propose a knowledge distillation approach that improves the ST model by transferring knowledge from a text translation model. Specifically, we first train a text translation model, regarded as the teacher model, and then train the ST model to learn the output probabilities of the teacher through knowledge distillation. Experiments on the English-French Augmented LibriSpeech and English-Chinese TED corpora show that end-to-end ST is feasible for both similar and dissimilar language pairs. Moreover, with the guidance of the teacher model, the end-to-end ST model gains significant improvements of over 3.5 BLEU points.
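To make the distillation objective concrete, here is a minimal PyTorch-style sketch of the kind of loss the abstract describes: the student ST model is trained against both the ground-truth tokens (cross-entropy) and the teacher MT model's output distribution (KL divergence). The interpolation weight `alpha`, the `temperature`, and the tensor names are illustrative assumptions, not values taken from the paper.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      alpha=0.8, temperature=1.0, pad_id=0):
    """Interpolated hard-label / soft-label loss for a student ST model.

    student_logits, teacher_logits: (batch, seq_len, vocab) over the same
    target vocabulary; labels: (batch, seq_len) ground-truth token ids.
    alpha, temperature, pad_id are illustrative hyperparameters.
    """
    vocab = student_logits.size(-1)

    # Hard-label term: standard cross-entropy against the reference tokens.
    ce = F.cross_entropy(student_logits.view(-1, vocab),
                         labels.view(-1), ignore_index=pad_id)

    # Soft-label term: match the teacher's (temperature-smoothed) distribution.
    T = temperature
    kd = F.kl_div(F.log_softmax(student_logits / T, dim=-1),
                  F.softmax(teacher_logits / T, dim=-1),
                  reduction="batchmean") * (T * T)

    return alpha * ce + (1.0 - alpha) * kd
```

At training time the teacher's logits would be computed from the source-language transcript with gradients disabled (e.g. under `torch.no_grad()`), so only the student ST model is updated.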

Authors (7)
  1. Yuchen Liu (156 papers)
  2. Hao Xiong (41 papers)
  3. Zhongjun He (19 papers)
  4. Jiajun Zhang (176 papers)
  5. Hua Wu (191 papers)
  6. Haifeng Wang (194 papers)
  7. Chengqing Zong (65 papers)
Citations (150)