
Progressive Multi-Granularity Training for Non-Autoregressive Translation (2106.05546v2)

Published 10 Jun 2021 in cs.CL

Abstract: Non-autoregressive translation (NAT) significantly accelerates the inference process by predicting the entire target sequence at once. However, recent studies show that NAT is weak at learning high-mode knowledge such as one-to-many translations. We argue that modes can be divided into various granularities, which can be learned from easy to hard. In this study, we empirically show that NAT models are prone to learning fine-grained lower-mode knowledge, such as words and phrases, compared with sentences. Based on this observation, we propose progressive multi-granularity training for NAT. More specifically, to make the most of the training data, we break down sentence-level examples into three types, i.e. words, phrases, and sentences, and as training proceeds, we progressively increase the granularity. Experiments on Romanian-English, English-German, Chinese-English, and Japanese-English demonstrate that our approach improves phrase translation accuracy and model reordering ability, thereby resulting in better translation quality against strong NAT baselines. We also show that more deterministic fine-grained knowledge can further enhance performance.
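
The abstract describes a curriculum that starts from fine-grained units (words) and progressively adds coarser ones (phrases, then sentences). The sketch below illustrates one way such a schedule could be wired up; the phase boundaries, helper names, and data format are assumptions for illustration, not the paper's actual implementation.

```python
# Minimal sketch of a progressive multi-granularity curriculum (assumed design,
# not the authors' code): begin with word-level pairs, then add phrase-level
# and finally sentence-level examples as training proceeds.
from typing import List, Tuple


def granularities_for_epoch(epoch: int, total_epochs: int) -> List[str]:
    """Return the granularities active at a given epoch (easy-to-hard schedule)."""
    if epoch < total_epochs // 3:
        return ["word"]                        # phase 1: words only
    if epoch < 2 * total_epochs // 3:
        return ["word", "phrase"]              # phase 2: words + phrases
    return ["word", "phrase", "sentence"]      # phase 3: full sentence-level data


def build_training_pool(
    examples: List[Tuple[str, str, str]],      # (source, target, granularity)
    epoch: int,
    total_epochs: int,
) -> List[Tuple[str, str]]:
    """Filter the multi-granularity pool down to the granularities active this epoch."""
    active = set(granularities_for_epoch(epoch, total_epochs))
    return [(src, tgt) for src, tgt, gran in examples if gran in active]
```

A schedule like this would be applied before each epoch to select which decomposed examples the NAT model sees, so the model first fits the more deterministic fine-grained knowledge before facing full sentences.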

Authors (6)
  1. Liang Ding (159 papers)
  2. Longyue Wang (87 papers)
  3. Xuebo Liu (54 papers)
  4. Derek F. Wong (69 papers)
  5. Dacheng Tao (829 papers)
  6. Zhaopeng Tu (135 papers)
Citations (41)
