Papers
Topics
Authors
Recent
Search
2000 character limit reached

Calibration of Encoder Decoder Models for Neural Machine Translation

Published 3 Mar 2019 in cs.LG, cs.CL, and stat.ML | (1903.00802v1)

Abstract: We study the calibration of several state of the art neural machine translation(NMT) systems built on attention-based encoder-decoder models. For structured outputs like in NMT, calibration is important not just for reliable confidence with predictions, but also for proper functioning of beam-search inference. We show that most modern NMT models are surprisingly miscalibrated even when conditioned on the true previous tokens. Our investigation leads to two main reasons -- severe miscalibration of EOS (end of sequence marker) and suppression of attention uncertainty. We design recalibration methods based on these signals and demonstrate improved accuracy, better sequence-level calibration, and more intuitive results from beam-search.

Citations (91)

Summary

Paper to Video (Beta)

Whiteboard

No one has generated a whiteboard explanation for this paper yet.

Open Problems

We haven't generated a list of open problems mentioned in this paper yet.

Continue Learning

We haven't generated follow-up questions for this paper yet.

Authors (2)

Collections

Sign up for free to add this paper to one or more collections.