Pointer-based Fusion of Bilingual Lexicons into Neural Machine Translation (1909.07907v1)

Published 17 Sep 2019 in cs.CL

Abstract: Neural machine translation (NMT) systems require large amounts of high-quality in-domain parallel corpora for training. State-of-the-art NMT systems still face challenges with out-of-vocabulary words and low-resource language pairs. In this paper, we propose and compare several models for fusing bilingual lexicons with an end-to-end trained sequence-to-sequence model for machine translation. The result is a fusion model with two information sources for the decoder: a neural conditional language model and a bilingual lexicon. This fusion model learns how to combine both sources of information to produce higher-quality translation output. Our experiments show that our proposed models work well in relatively low-resource scenarios, and also effectively reduce the parameter size and training cost for NMT without sacrificing performance.
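The fusion idea from the abstract can be made concrete with a small sketch. The PyTorch snippet below is a minimal illustration, not the paper's exact architecture: a learned scalar gate mixes the decoder's softmax distribution with a probability distribution derived from bilingual-lexicon lookups. All names here (`LexiconFusionDecoderStep`, `lexicon_dist`, the gating layer) are hypothetical assumptions for illustration.

```python
import torch
import torch.nn as nn

class LexiconFusionDecoderStep(nn.Module):
    """Illustrative gated fusion of a decoder language-model distribution
    with a bilingual-lexicon distribution (a sketch, not the paper's
    exact model)."""

    def __init__(self, hidden_size: int, vocab_size: int):
        super().__init__()
        self.out_proj = nn.Linear(hidden_size, vocab_size)  # decoder softmax layer
        self.gate = nn.Linear(hidden_size, 1)               # scalar mixing gate

    def forward(self, decoder_state: torch.Tensor, lexicon_dist: torch.Tensor) -> torch.Tensor:
        # decoder_state: (batch, hidden_size) decoder hidden state at this step
        # lexicon_dist:  (batch, vocab_size) probabilities over target words,
        #                obtained by looking up attended source words in the lexicon
        lm_dist = torch.softmax(self.out_proj(decoder_state), dim=-1)
        g = torch.sigmoid(self.gate(decoder_state))  # gate in (0, 1), broadcast over vocab
        # Convex combination of the two information sources
        return g * lm_dist + (1.0 - g) * lexicon_dist
```

A pointer-style variant, closer to what the title suggests, would restrict `lexicon_dist` to lexicon translations of the currently attended source token rather than spreading mass over the full vocabulary; either way, the gate lets the model learn per-step how much to trust each source.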

Authors (3)
  1. Jetic Gū
  2. Hassan S. Shavarani
  3. Anoop Sarkar
Citations (4)