Improving Simultaneous Translation by Incorporating Pseudo-References with Fewer Reorderings (2010.11247v2)

Published 21 Oct 2020 in cs.CL

Abstract: Simultaneous translation is vastly different from full-sentence translation, in the sense that it starts translating before the source sentence ends, with a delay of only a few words. However, due to the lack of large-scale, high-quality simultaneous translation datasets, most such systems are still trained on conventional full-sentence bitexts. This is far from ideal for the simultaneous scenario because of the abundance of unnecessary long-distance reorderings in those bitexts. We propose a novel method that rewrites the target side of existing full-sentence corpora into simultaneous-style translations. Experiments on Zh->En and Ja->En simultaneous translation show substantial improvements (up to +2.7 BLEU) with the addition of these generated pseudo-references.
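To make the core idea concrete, the sketch below shows one plausible way to produce a "simultaneous-style" pseudo-reference: decode the target with a wait-k, prefix-to-prefix policy so the output stays roughly monotonic with the source. This is only an illustration of the general technique, not the authors' exact generation or filtering pipeline; `translate_prefix` is a hypothetical stand-in for any pretrained MT model that can extend a partial translation given a source prefix.

```python
# Hedged sketch: generate a monotonic (wait-k style) pseudo-reference.
# `translate_prefix(source_prefix, target_so_far) -> next_token` is a
# hypothetical model interface, NOT an API from the paper or any library.

from typing import Callable, List


def generate_pseudo_reference(
    source_tokens: List[str],
    translate_prefix: Callable[[List[str], List[str]], str],
    k: int = 3,
) -> List[str]:
    """Decode one target token at a time, always lagging k source tokens behind."""
    target: List[str] = []
    # Emit one target token per newly revealed source token after the initial
    # k-token wait, so the output cannot depend on far-future source words.
    for cutoff in range(k, len(source_tokens) + 1):
        next_token = translate_prefix(source_tokens[:cutoff], target)
        if next_token == "</s>":
            return target
        target.append(next_token)
    # Once the full source is visible, finish decoding the tail.
    while True:
        next_token = translate_prefix(source_tokens, target)
        if next_token == "</s>" or len(target) > 2 * len(source_tokens):
            break
        target.append(next_token)
    return target


if __name__ == "__main__":
    # Toy stand-in model that copies source tokens monotonically (illustration only).
    def toy_translate_prefix(src: List[str], tgt: List[str]) -> str:
        return src[len(tgt)] if len(tgt) < len(src) else "</s>"

    source = "ta zai huiyi shang fabiao le jianghua".split()
    print(generate_pseudo_reference(source, toy_translate_prefix, k=3))
```

The resulting pseudo-references contain far fewer long-distance reorderings than the original full-sentence references, which is what makes them better supervision for a simultaneous model trained under a low-latency policy.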

Authors (5)
  1. Junkun Chen (27 papers)
  2. Renjie Zheng (29 papers)
  3. Atsuhito Kita (1 paper)
  4. Mingbo Ma (32 papers)
  5. Liang Huang (108 papers)
Citations (22)
