Selecting Artificially-Generated Sentences for Fine-Tuning Neural Machine Translation (1909.12016v1)

Published 26 Sep 2019 in cs.CL

Abstract: Neural Machine Translation (NMT) models tend to achieve their best performance when trained on larger sets of parallel sentences. For this reason, augmenting the training set with artificially-generated sentence pairs can boost performance. However, performance can also be improved with a small number of sentences if they are in the same domain as the test set. Accordingly, we explore the use of artificially-generated sentences along with data-selection algorithms to improve German-to-English NMT models trained solely with authentic data. In this work, we show how artificially-generated sentences can be more beneficial than authentic pairs, and demonstrate their advantages when used in combination with data-selection algorithms.

Authors: Alberto Poncelas, Andy Way
