Re-Translation Strategies For Long Form, Simultaneous, Spoken Language Translation (1912.03393v2)

Published 6 Dec 2019 in cs.CL, cs.AI, and cs.LG

Abstract: We investigate the problem of simultaneous machine translation of long-form speech content. We target a continuous speech-to-text scenario, generating translated captions for a live audio feed, such as a lecture or play-by-play commentary. As this scenario allows for revisions to our incremental translations, we adopt a re-translation approach to simultaneous translation, where the source is repeatedly translated from scratch as it grows. This approach naturally exhibits very low latency and high final quality, but at the cost of incremental instability as the output is continuously refined. We experiment with a pipeline of industry-grade speech recognition and translation tools, augmented with simple inference heuristics to improve stability. We use TED Talks as a source of multilingual test data, developing our techniques on English-to-German spoken language translation. Our minimalist approach to simultaneous translation allows us to easily scale our final evaluation to six more target languages, dramatically improving incremental stability for all of them.
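The core of the re-translation approach is simple: each time the source prefix grows, the whole prefix is translated again from scratch, and a stability heuristic trims the most volatile tail of each intermediate output before display. The sketch below is a toy illustration of that loop with a mask-k-style truncation heuristic; `translate` is a hypothetical stand-in for a real MT system, not the paper's pipeline.

```python
def translate(source_tokens):
    # Toy stand-in for an MT system: "translates" by upper-casing.
    return [t.upper() for t in source_tokens]

def retranslate_stream(source_stream, k=2):
    """Yield a caption for each growing source prefix.

    Intermediate captions are truncated by k tokens (a mask-k-style
    heuristic: the final tokens are the most likely to be revised);
    the final caption is emitted in full.
    """
    prefix = []
    tokens = list(source_stream)
    for i, token in enumerate(tokens):
        prefix.append(token)
        output = translate(prefix)  # re-translate the prefix from scratch
        if i == len(tokens) - 1:
            yield output  # final output: show everything
        else:
            yield output[:max(len(output) - k, 0)]  # hide volatile tail

captions = list(retranslate_stream(["hello", "world", "again"], k=1))
```

Hiding the tail trades a little latency for much lower "flicker", since tokens that would have been revised in a later re-translation are never shown in the first place.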

Authors (6)
  1. Naveen Arivazhagan (15 papers)
  2. Colin Cherry (38 papers)
  3. Te I (3 papers)
  4. Wolfgang Macherey (23 papers)
  5. Pallavi Baljekar (3 papers)
  6. George Foster (24 papers)
Citations (51)