Why Neural Machine Translation Prefers Empty Outputs (2012.13454v1)

Published 24 Dec 2020 in cs.CL

Abstract: We investigate why neural machine translation (NMT) systems assign high probability to empty translations. We find two explanations. First, label smoothing makes correct-length translations less confident, making it easier for the empty translation to finally outscore them. Second, NMT systems use the same, high-frequency EoS word to end all target sentences, regardless of length. This creates an implicit smoothing that increases zero-length translations. Using different EoS types in target sentences of different lengths exposes and eliminates this implicit smoothing.
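The two mechanisms named in the abstract can be illustrated with a small toy calculation. The sketch below is not the paper's analysis. It assumes an idealized model that exactly fits label-smoothed targets, so every correct token (including EoS) gets probability 1 - eps, while any incorrect token, such as EoS at the first step, gets at least eps / (V - 1). The function names, the vocabulary size, the bucket size, and the `</s_k>` token format are all hypothetical choices made for illustration.

```python
import math


def empty_logprob_floor(eps: float, vocab_size: int) -> float:
    """Lower bound on log P(EoS at step 1) under the toy assumption that
    smoothing spreads eps uniformly over the V - 1 incorrect tokens."""
    return math.log(eps / (vocab_size - 1))


def full_translation_logprob(num_tokens: int, eps: float) -> float:
    """Log-probability of a correct translation with num_tokens words plus
    one EoS, assuming every step scores exactly 1 - eps (toy assumption)."""
    return (num_tokens + 1) * math.log(1.0 - eps)


def min_length_outscored(eps: float, vocab_size: int) -> int:
    """Smallest num_tokens at which the correct translation scores strictly
    below the empty-translation floor in this toy model."""
    per_step = math.log(1.0 - eps)
    floor = empty_logprob_floor(eps, vocab_size)
    # Smallest integer number of steps with steps * per_step < floor.
    steps = math.floor(floor / per_step) + 1
    return steps - 1  # one of the steps is the EoS itself


def length_specific_eos(num_tokens: int, bucket_size: int = 10) -> str:
    """Hypothetical length-bucketed EoS symbol, e.g. '</s_2>' for 20-29 tokens."""
    return f"</s_{num_tokens // bucket_size}>"


def append_eos(tokens: list[str], bucket_size: int = 10) -> list[str]:
    """Return a copy of the target sentence ending in its length-specific EoS."""
    return tokens + [length_specific_eos(len(tokens), bucket_size)]


if __name__ == "__main__":
    eps, vocab = 0.1, 32000  # assumed values, not taken from the paper
    n = min_length_outscored(eps, vocab)
    print("empty-translation floor:", empty_logprob_floor(eps, vocab))
    print("correct translations are outscored from length", n)
    print(append_eos("das ist ein Test".split()))
```

In this toy setting, a larger eps lowers the per-token score of the correct translation and raises the floor for the empty one, so the crossover length shrinks. With length-specific EoS symbols, a short-sentence EoS no longer shares its frequency with the EoS of long sentences, which is the kind of separation the abstract describes; the bucketing scheme shown here is only one conceivable way to do it.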

Authors (3)
  1. Xing Shi (20 papers)
  2. Yijun Xiao (10 papers)
  3. Kevin Knight (29 papers)
Citations (9)
