Rethinking Document-level Neural Machine Translation (2010.08961v2)

Published 18 Oct 2020 in cs.CL

Abstract: This paper does not aim to introduce a novel model for document-level neural machine translation. Instead, we return to the original Transformer model and seek to answer the following question: is the capacity of current models strong enough for document-level translation? Interestingly, we observe that the original Transformer with appropriate training techniques can achieve strong results for document translation, even for documents as long as 2,000 words. We evaluate this model and several recent approaches on nine document-level datasets and two sentence-level datasets across six languages. Experiments show that document-level Transformer models outperform sentence-level ones and many previous methods on a comprehensive set of metrics, including BLEU, four lexical indices, three newly proposed assistant linguistic indicators, and human evaluation.
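The abstract's core idea is that a standard sentence-level Transformer can translate whole documents if the training data is prepared at the document level. The sketch below is an illustration of one common way to do this, not the authors' released code: aligned sentence pairs are packed into document-level source/target pairs separated by a boundary token, within a word budget matching the 2,000-word length mentioned in the abstract. The `<sep>` token name and the greedy packing strategy are assumptions made for illustration.

```python
# Illustrative sketch (assumed, not the paper's implementation): pack aligned
# sentence pairs into document-level examples so a vanilla Transformer can be
# trained on full-document inputs.

from typing import List, Tuple

SEP = "<sep>"      # hypothetical sentence-boundary marker
MAX_WORDS = 2000   # document length budget, taken from the abstract


def make_doc_examples(src_sents: List[str],
                      tgt_sents: List[str]) -> List[Tuple[str, str]]:
    """Greedily group consecutive aligned sentence pairs into document-level
    pairs, keeping each source document under MAX_WORDS words."""
    examples: List[Tuple[str, str]] = []
    src_buf: List[str] = []
    tgt_buf: List[str] = []
    n_words = 0
    for src, tgt in zip(src_sents, tgt_sents):
        words = len(src.split())
        # Start a new document-level example once the word budget is exceeded.
        if src_buf and n_words + words > MAX_WORDS:
            examples.append((f" {SEP} ".join(src_buf), f" {SEP} ".join(tgt_buf)))
            src_buf, tgt_buf, n_words = [], [], 0
        src_buf.append(src)
        tgt_buf.append(tgt)
        n_words += words
    if src_buf:
        examples.append((f" {SEP} ".join(src_buf), f" {SEP} ".join(tgt_buf)))
    return examples


if __name__ == "__main__":
    src = ["Er öffnete die Tür .", "Dann ging er hinein ."]
    tgt = ["He opened the door .", "Then he went inside ."]
    for s, t in make_doc_examples(src, tgt):
        print(s, "=>", t)
```

Each resulting pair can then be fed to an unmodified encoder-decoder Transformer; the point of the paper is that, with appropriate training techniques, no architectural change is needed to benefit from this document-level context.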

Authors (7)
  1. Zewei Sun (15 papers)
  2. Mingxuan Wang (83 papers)
  3. Hao Zhou (351 papers)
  4. Chengqi Zhao (15 papers)
  5. Shujian Huang (106 papers)
  6. Jiajun Chen (125 papers)
  7. Lei Li (1293 papers)
Citations (42)