
Non-parametric, Nearest-neighbor-assisted Fine-tuning for Neural Machine Translation (2305.13648v1)

Published 23 May 2023 in cs.CL and cs.AI

Abstract: Non-parametric, k-nearest-neighbor algorithms have recently made inroads to assist generative models such as LLMs and machine translation decoders. We explore whether such non-parametric models can improve machine translation models at the fine-tuning stage by incorporating statistics from the kNN predictions to inform the gradient updates for a baseline translation model. Several methods could be used to incorporate kNN statistics; we investigate gradient scaling by a gating mechanism, by the kNN's ground-truth probability, and by reinforcement learning. For four standard in-domain machine translation datasets, compared with classic fine-tuning, we report consistent improvements from all three methods of as much as 1.45 BLEU and 1.28 BLEU for German-English and English-German translation, respectively. Through qualitative analysis, we find particular improvements in translating grammatical relations and function words, which increases the fluency of our model.
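To make the idea concrete, below is a minimal PyTorch sketch of the simplest variant described in the abstract: rescaling the per-token fine-tuning loss by the probability the kNN retrieval assigns to the reference token. The names (knn_scaled_loss, knn_probs) and the exact scaling rule are illustrative assumptions, not the paper's implementation; the gating and reinforcement-learning variants are not shown.

```python
# Hypothetical sketch of kNN-probability-scaled fine-tuning loss for an NMT model.
# Assumes knn_probs has been precomputed from a kNN datastore over decoder states.
import torch
import torch.nn.functional as F

def knn_scaled_loss(logits, targets, knn_probs, pad_id=0):
    """
    logits:    (batch, seq, vocab) decoder outputs of the baseline NMT model
    targets:   (batch, seq) reference token ids
    knn_probs: (batch, seq) probability the kNN retrieval assigns to each
               reference token (an assumed, precomputed input)
    """
    vocab = logits.size(-1)
    # Standard token-level cross-entropy, kept per-token so it can be rescaled.
    ce = F.cross_entropy(
        logits.reshape(-1, vocab), targets.reshape(-1),
        ignore_index=pad_id, reduction="none",
    ).reshape(targets.shape)

    # Scale each token's gradient contribution by the kNN's confidence in the
    # reference token; this is one simple choice, not the paper's exact rule.
    scale = 1.0 + knn_probs.detach()
    mask = (targets != pad_id).float()
    return (scale * ce * mask).sum() / mask.sum()
```

In this sketch the kNN statistics only modulate the loss; the translation model's parameters are still updated by ordinary backpropagation, which matches the abstract's framing of kNN predictions informing the gradient updates rather than being used at decoding time.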
