
Conversational SimulMT: Efficient Simultaneous Translation with Large Language Models (2402.10552v3)

Published 16 Feb 2024 in cs.CL

Abstract: Simultaneous machine translation (SimulMT) presents a challenging trade-off between translation quality and latency. Recent studies have shown that LLMs can achieve good performance in SimulMT tasks. However, this often comes at the expense of high inference cost and latency. In this paper, we propose a conversational SimulMT framework to enhance the inference efficiency of LLM-based SimulMT through multi-turn-dialogue-based decoding. Our experiments with Llama2-7b-chat on two SimulMT benchmarks demonstrate the superiority of LLMs in translation quality while achieving comparable computational latency to specialized SimulMT models.

Authors (5)
  1. Minghan Wang (23 papers)
  2. Thuy-Trang Vu (23 papers)
  3. Ehsan Shareghi (54 papers)
  4. Gholamreza Haffari (141 papers)
  5. Yuxia Wang (41 papers)
Citations (1)