TransLLaMa: LLM-based Simultaneous Translation System (2402.04636v1)

Published 7 Feb 2024 in cs.CL

Abstract: Decoder-only LLMs have recently demonstrated impressive capabilities in text generation and reasoning. Nonetheless, they have limited applications in simultaneous machine translation (SiMT), currently dominated by encoder-decoder transformers. This study demonstrates that, after fine-tuning on a small dataset comprising causally aligned source and target sentence pairs, a pre-trained open-source LLM can control input segmentation directly by generating a special "wait" token. This obviates the need for a separate policy and enables the LLM to perform English-German and English-Russian SiMT tasks with BLEU scores that are comparable to those of specific state-of-the-art baselines. We also evaluated closed-source models such as GPT-4, which displayed encouraging results in performing the SiMT task without prior training (zero-shot), indicating a promising avenue for enhancing future SiMT systems.
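The abstract's core mechanism lends itself to a short sketch. Below is a minimal Python illustration of the decoding loop it describes, assuming a hypothetical model interface: the names generate_token, <WAIT>, <EOS>, the allow_wait flag, and the chunked source_stream are all invented for illustration and are not the paper's actual API. At each step the fine-tuned LLM either emits a target token or the special wait token; a wait triggers reading the next source segment, so no separate read/write policy is needed.

```python
# Hedged sketch of the wait-token decoding loop the abstract describes:
# the fine-tuned LLM itself emits a special "wait" token when it needs
# more source context, replacing a separate segmentation policy.
# The model interface and token names are illustrative assumptions.

from typing import Iterator, List

WAIT_TOKEN = "<WAIT>"  # assumed name for the special wait token
EOS_TOKEN = "<EOS>"    # assumed end-of-translation marker


def simultaneous_translate(model, source_stream: Iterator[List[str]]) -> List[str]:
    """Interleave reading source chunks with emitting target tokens."""
    context: List[str] = []      # source tokens revealed so far
    translation: List[str] = []  # target tokens emitted so far
    source_done = False

    context.extend(next(source_stream))  # reveal the first source chunk

    while True:
        token = model.generate_token(context, translation,
                                     allow_wait=not source_done)
        if token == WAIT_TOKEN:
            try:
                # Model asked for more input: reveal the next source chunk.
                context.extend(next(source_stream))
            except StopIteration:
                source_done = True  # source exhausted; suppress further waits
        elif token == EOS_TOKEN:
            break
        else:
            translation.append(token)
    return translation


class ToyModel:
    """Stand-in for the fine-tuned LLM, so the sketch runs end to end.

    It waits until the whole source is visible, then copies it verbatim;
    a real model would translate incrementally and learn when to wait.
    """

    def generate_token(self, context, translation, allow_wait=True):
        if allow_wait:
            return WAIT_TOKEN
        if len(translation) < len(context):
            return context[len(translation)]
        return EOS_TOKEN


chunks = iter([["Hello", ","], ["how", "are"], ["you", "?"]])
print(simultaneous_translate(ToyModel(), chunks))
# -> ['Hello', ',', 'how', 'are', 'you', '?']
```

Because the read/write decision comes from the model's own next-token distribution, the latency/quality trade-off is learned during fine-tuning rather than hand-designed, which is presumably what lets the approach drop the separate policy used by encoder-decoder SiMT systems.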

Authors (3)
  1. Roman Koshkin (3 papers)
  2. Katsuhito Sudoh (35 papers)
  3. Satoshi Nakamura (94 papers)
Citations (12)