Terminology-Aware Translation with Constrained Decoding and Large Language Model Prompting (2310.05824v1)

Published 9 Oct 2023 in cs.CL

Abstract: Terminology correctness is important in downstream applications of machine translation, and a prevalent way to ensure it is to inject terminology constraints into a translation system. In our submission to the WMT 2023 terminology translation task, we adopt a translate-then-refine approach that can be domain-independent and requires minimal manual effort. We first train a terminology-aware model by annotating random source words with pseudo-terminology translations obtained from word alignment. We then explore two post-processing methods. In the first, we use an alignment process to detect whether a terminology constraint has been violated, and if so, we re-decode with the violating word negatively constrained. In the second, we leverage an LLM to refine a hypothesis by providing it with the terminology constraints. Results show that our terminology-aware model learns to incorporate terminologies effectively, and the LLM refinement process can further improve terminology recall.
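The training-data annotation step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the inline tag format (`<term>` ... `</term>`), the function name `annotate_source`, the sampling probability, and the representation of alignments as (source index, target index) pairs are all assumptions made for illustration.

```python
import random


def annotate_source(src_tokens, tgt_tokens, alignment, prob=0.1, rng=None,
                    open_tag="<term>", close_tag="</term>"):
    """Annotate randomly chosen source words with their aligned target words.

    Hypothetical helper: the tag format and sampling rate are assumptions,
    not details taken from the paper.
    alignment is a list of (source_index, target_index) pairs, as a word
    aligner might produce.
    """
    rng = rng or random.Random()

    # Map each source index to the target indices aligned to it.
    src_to_tgt = {}
    for s, t in alignment:
        src_to_tgt.setdefault(s, []).append(t)

    out = []
    for i, tok in enumerate(src_tokens):
        out.append(tok)
        targets = src_to_tgt.get(i)
        # Only aligned words can receive a pseudo-terminology hint.
        if targets and rng.random() < prob:
            hint = " ".join(tgt_tokens[t] for t in sorted(targets))
            out.extend([open_tag, hint, close_tag])
    return out


if __name__ == "__main__":
    src = ["the", "cat"]
    tgt = ["die", "Katze"]
    align = [(0, 0), (1, 1)]
    # prob=1.0 annotates every aligned word; a real setup would use a small rate.
    print(" ".join(annotate_source(src, tgt, align, prob=1.0)))
```

A model trained on such annotated sources would see the target-side hint next to the source word, which is one plausible way for it to learn to copy supplied terminology at inference time.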


Authors (2)

Nikolay Bogoychev
Pinzhen Chen

Citations (9)
