
Document-Level Language Models for Machine Translation (2310.12303v1)

Published 18 Oct 2023 in cs.CL, cs.AI, and cs.LG

Abstract: Despite their known limitations, most machine translation systems today still operate at the sentence level. One reason for this is that most parallel training data is aligned only at the sentence level, without document-level meta-information available. In this work, we set out to build context-aware translation systems utilizing document-level monolingual data instead. This can be achieved by combining any existing sentence-level translation model with a document-level language model. We improve existing approaches by leveraging recent advancements in model combination. Additionally, we propose novel weighting techniques that make the system combination more flexible and significantly reduce computational overhead. In a comprehensive evaluation on four diverse translation tasks, we show that our extensions improve document-targeted scores substantially and are also computationally more efficient. However, we also find that in most scenarios, back-translation gives even better results, at the cost of having to re-train the translation system. Finally, we explore language model fusion in light of recent advancements in LLMs. Our findings suggest that there might be strong potential in utilizing LLMs via model combination.
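
The combination the abstract describes can be pictured as log-linear fusion of a sentence-level translation model's score with a document-level language model's score, for example during n-best rescoring. The sketch below is a minimal illustration of that idea under stated assumptions, not the paper's actual implementation: the `lm.log_prob` interface, the fusion weight `lam`, and the n-best list format are all hypothetical stand-ins.

```python
import math

# Hedged sketch of shallow, log-linear model fusion for document-level MT.
# Assumptions (not from the paper): the sentence-level system emits an n-best
# list of (hypothesis, mt_log_prob) pairs, and `lm` exposes a hypothetical
# lm.log_prob(hypothesis, context=...) scoring the hypothesis conditioned on
# the target-side document translated so far.

def fused_score(mt_log_prob: float, lm_log_prob: float, lam: float = 0.3) -> float:
    """Log-linear combination: MT score plus a weighted document-LM score.
    The weight `lam` is an illustrative value, typically tuned on held-out data."""
    return mt_log_prob + lam * lm_log_prob

def rescore_nbest(nbest, doc_prefix, lm, lam=0.3):
    """Return the hypothesis maximizing the combined score.

    nbest:      list of (hypothesis, mt_log_prob) from the sentence-level system
    doc_prefix: previously translated sentences, used as document context
    """
    best_hyp, best_score = None, -math.inf
    for hyp, mt_lp in nbest:
        score = fused_score(mt_lp, lm.log_prob(hyp, context=doc_prefix), lam)
        if score > best_score:
            best_hyp, best_score = hyp, score
    return best_hyp
```

Because the document-level language model only rescores candidates produced by the unchanged sentence-level system, no re-training of the translation model is required; this is the contrast the abstract draws with back-translation, which scores better in most scenarios but forces a re-train.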

Authors (5)
  1. Frithjof Petrick (1 paper)
  2. Christian Herold (20 papers)
  3. Pavel Petrushkov (9 papers)
  4. Shahram Khadivi (29 papers)
  5. Hermann Ney (104 papers)
Citations (8)