
Segmentation-Free Streaming Machine Translation (2309.14823v2)

Published 26 Sep 2023 in cs.CL

Abstract: Streaming Machine Translation (MT) is the task of translating an unbounded input text stream in real time. The traditional cascade approach, which combines an Automatic Speech Recognition (ASR) system and an MT system, relies on an intermediate segmentation step that splits the transcription stream into sentence-like units. However, imposing a hard segmentation constrains the MT system and is a source of errors. This paper proposes a Segmentation-Free framework that enables the model to translate an unsegmented source stream by delaying the segmentation decision until the translation has been generated. Extensive experiments show that the proposed Segmentation-Free framework achieves a better quality-latency trade-off than competing approaches that use an independent segmentation model. Software, data and models will be released upon paper acceptance.

Authors (5)
  1. Javier Iranzo-Sánchez (6 papers)
  2. Jorge Iranzo-Sánchez (2 papers)
  3. Adrià Giménez (3 papers)
  4. Jorge Civera (6 papers)
  5. Alfons Juan (7 papers)
