Neural Simultaneous Speech Translation Using Alignment-Based Chunking (2005.14489v1)

Published 29 May 2020 in cs.CL

Abstract: In simultaneous machine translation, the objective is to determine when to produce a partial translation given a continuous stream of source words, with a trade-off between latency and quality. We propose a neural machine translation (NMT) model that dynamically decides whether to continue reading input or to generate output words. The model is composed of two main components: one that dynamically decides on ending a source chunk, and another that translates the consumed chunk. We train the components jointly and in a manner consistent with the inference conditions. To generate chunked training data, we propose a method that utilizes word alignment while also preserving enough context. We compare models with bidirectional and unidirectional encoders of different depths, on both real speech and text input. Our results on the IWSLT 2020 English-to-German task outperform a wait-k baseline by 2.6 to 3.7% BLEU absolute.
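The alignment-based chunking idea can be illustrated with a small sketch. A minimal, assumed version (not the paper's exact procedure, which additionally preserves surrounding context): given word-alignment links between a source and target sentence, a valid chunk boundary is a prefix pair `(i, j)` that no alignment link crosses, so each chunk can be translated as soon as its source words have been read.

```python
def chunk_boundaries(alignment, src_len):
    """Given word-alignment links (src_idx, tgt_idx), return the
    (i, j) prefix pairs after which the sentence pair splits into
    monotone chunks: no link crosses the boundary.

    Illustrative sketch only; the paper's method also keeps extra
    context around chunk boundaries.
    """
    boundaries = []
    for i in range(1, src_len + 1):
        # smallest target prefix covering all targets aligned to src[:i]
        j = max((t + 1 for s, t in alignment if s < i), default=0)
        # valid only if no target word in tgt[:j] aligns past src[:i]
        if all(s < i for s, t in alignment if t < j):
            boundaries.append((i, j))
    return boundaries

# Example: source words 0..2, with a reordering between positions 1 and 2.
# Links (1,2) and (2,1) cross, so no boundary is possible between them.
print(chunk_boundaries([(0, 0), (1, 2), (2, 1)], 3))  # → [(1, 1), (3, 3)]
```

The crossing links force the reordered region into a single chunk, which is the property that lets the downstream translator emit each chunk without waiting for later source words.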

Authors (4)
  1. Patrick Wilken (6 papers)
  2. Tamer Alkhouli (7 papers)
  3. Evgeny Matusov (11 papers)
  4. Pavel Golik (5 papers)
Citations (14)
