Why Neural Machine Translation Prefers Empty Outputs (2012.13454v1)
Published 24 Dec 2020 in cs.CL
Abstract: We investigate why neural machine translation (NMT) systems assign high probability to empty translations. We find two explanations. First, label smoothing makes correct-length translations less confident, making it easier for the empty translation to eventually outscore them. Second, NMT systems use the same, high-frequency EoS word to end all target sentences, regardless of length. This creates an implicit smoothing that increases the probability of zero-length translations. Using different EoS types in target sentences of different lengths exposes and eliminates this implicit smoothing.
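The toy calculation below is a minimal sketch of the first mechanism, not the authors' code. It assumes that under label-smoothing factor eps, a trained model's per-token confidence is roughly capped at 1 - eps, and that the ever-present EoS word gets a modest floor probability (the 0.05 value is hypothetical). The `add_eos` helper at the end sketches the paper's proposed fix of length-dependent EoS types; the bucket boundaries are illustrative assumptions.

```python
import numpy as np

def seq_logprob(per_token_prob, length):
    """Log-probability of a sequence whose `length` tokens are each
    predicted with probability `per_token_prob`."""
    return length * np.log(per_token_prob)

# Compare a 20-token reference translation against the empty translation
# (emit EoS at step 1) under increasing label-smoothing factors.
for eps in [0.0, 0.1, 0.3]:
    conf = 1.0 - eps                              # per-token confidence cap under smoothing
    lp_correct = seq_logprob(conf, length=20)     # correct-length hypothesis
    lp_empty = np.log(0.05)                       # hypothetical step-1 EoS probability
    print(f"eps={eps:.1f}: logP(correct)={lp_correct:.2f}, "
          f"logP(empty)={lp_empty:.2f}, empty wins: {lp_empty > lp_correct}")

def add_eos(tokens, length_buckets=(5, 10, 20)):
    """Sketch of the paper's fix: tag EoS with a length bucket so that
    sentences of different lengths no longer share one high-frequency
    EoS type (bucket boundaries here are hypothetical)."""
    for b in length_buckets:
        if len(tokens) <= b:
            return tokens + [f"<eos_{b}>"]
    return tokens + ["<eos_long>"]
```

With eps = 0.3, the accumulated per-token discount over 20 tokens pushes the correct-length hypothesis below the one-step empty hypothesis, which is the effect the abstract's first explanation describes; bucketed EoS types remove the shared, every-sentence EoS that drives the second, implicit form of smoothing.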
- Xing Shi
- Yijun Xiao
- Kevin Knight