Attention Link: An Efficient Attention-Based Low Resource Machine Translation Architecture (2302.00340v1)

Published 1 Feb 2023 in cs.CL and cs.AI

Abstract: Transformers have achieved great success in machine translation, but transformer-based NMT models often require millions of bilingual parallel corpus for training. In this paper, we propose a novel architecture named as attention link (AL) to help improve transformer models' performance, especially in low training resources. We theoretically demonstrate the superiority of our attention link architecture in low training resources. Besides, we have done a large number of experiments, including en-de, de-en, en-fr, en-it, it-en, en-ro translation tasks on the IWSLT14 dataset as well as real low resources scene on bn-gu and gu-ta translation tasks on the CVIT PIB dataset. All the experiment results show our attention link is powerful and can lead to a significant improvement. In addition, we achieve a 37.9 BLEU score, a new sota, on the IWSLT14 de-en task by combining our attention link and other advanced methods.

Summary

We haven't generated a summary for this paper yet.

Summarize Now

Follow-up Questions

We haven't generated follow-up questions for this paper yet.

Generate Now

Authors (1)

Zeping Min

Attention Link: An Efficient Attention-Based Low Resource Machine Translation Architecture (2302.00340v1)

Summary

Follow-up Questions

Related Papers

Authors (1)