
Fuzzy Alignments in Directed Acyclic Graph for Non-Autoregressive Machine Translation (2303.06662v2)

Published 12 Mar 2023 in cs.CL

Abstract: Non-autoregressive translation (NAT) reduces decoding latency but suffers from performance degradation due to the multi-modality problem. Recently, the directed acyclic graph structure has achieved great success in NAT; it tackles the multi-modality problem by introducing dependencies between vertices. However, training it with a negative log-likelihood loss implicitly requires a strict alignment between reference tokens and vertices, which weakens its ability to handle multiple translation modalities. In this paper, we hold the view that all paths in the graph are fuzzily aligned with the reference sentence. We do not require an exact alignment but instead train the model to maximize a fuzzy alignment score between the graph and the reference, which takes the translations captured in all modalities into account. Extensive experiments on major WMT benchmarks show that our method substantially improves translation performance and increases prediction confidence, setting a new state of the art for NAT on the raw training data.

Authors (5)
  1. Zhengrui Ma (18 papers)
  2. Chenze Shao (22 papers)
  3. Shangtong Gui (4 papers)
  4. Min Zhang (630 papers)
  5. Yang Feng (230 papers)
Citations (14)
