Non-Monotonic Latent Alignments for CTC-Based Non-Autoregressive Machine Translation (2210.03953v1)

Published 8 Oct 2022 in cs.CL

Abstract: Non-autoregressive translation (NAT) models are typically trained with the cross-entropy loss, which forces the model outputs to be aligned verbatim with the target sentence and will highly penalize small shifts in word positions. Latent alignment models relax the explicit alignment by marginalizing out all monotonic latent alignments with the CTC loss. However, they cannot handle non-monotonic alignments, which is non-negligible as there is typically global word reordering in machine translation. In this work, we explore non-monotonic latent alignments for NAT. We extend the alignment space to non-monotonic alignments to allow for the global word reordering and further consider all alignments that overlap with the target sentence. We non-monotonically match the alignments to the target sentence and train the latent alignment model to maximize the F1 score of non-monotonic matching. Extensive experiments on major WMT benchmarks show that our method substantially improves the translation performance of CTC-based models. Our best model achieves 30.06 BLEU on WMT14 En-De with only one-iteration decoding, closing the gap between non-autoregressive and autoregressive models.

Authors (2)

Chenze Shao (22 papers)
Yang Feng (230 papers)

Citations (19)

View on Semantic Scholar

Summary

We haven't generated a summary for this paper yet.

Summarize Now

Non-Monotonic Latent Alignments for CTC-Based Non-Autoregressive Machine Translation (2210.03953v1)

Summary

Related Papers