Document-level Neural Machine Translation with Document Embeddings (2009.08775v1)
Published 16 Sep 2020 in cs.CL
Abstract: Standard neural machine translation (NMT) assumes independence from document-level context. Most existing document-level NMT methods settle for a shallow use of document-level information, whereas this work exploits detailed document-level context through multiple forms of document embeddings, which can model deeper and richer document-level context. The proposed document-aware NMT enhances the Transformer baseline by introducing both global and local document-level clues on the source end. Experiments show that the proposed method significantly improves translation performance over strong baselines and other related studies.
- Shu Jiang
- Hai Zhao
- Zuchao Li
- Bao-Liang Lu