Negative Lexical Constraints in Neural Machine Translation (2308.03601v1)

Published 7 Aug 2023 in cs.CL

Abstract: This paper explores negative lexical constraining in English-to-Czech neural machine translation. Negative lexical constraining is used to prohibit certain words or expressions in the translation produced by the neural translation model. We compared various methods based on modifying either the decoding process or the training data. The comparison was performed on two tasks: paraphrasing and feedback-based translation refinement. We also studied the extent to which these methods "evade" the constraints presented to the model (usually in dictionary form) by generating a different surface form of a given constraint. We propose a way to mitigate this issue by training with stemmed negative constraints, countering the model's ability to produce a variety of surface forms of a word that can bypass the constraint. We demonstrate that our method improves the constraining, although the problem still persists in many cases.
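As a rough illustration of decode-time negative constraining (not the paper's own implementation), Hugging Face's `generate()` accepts a `bad_words_ids` argument that masks banned token sequences during beam search. The sketch below assumes the `Helsinki-NLP/opus-mt-en-cs` Marian checkpoint and a single banned Czech surface form, both chosen here for illustration only:

```python
# Minimal sketch of decode-time negative lexical constraining.
# Assumptions (not from the paper): the Helsinki-NLP/opus-mt-en-cs
# checkpoint and the bad_words_ids mechanism stand in for the paper's
# own decoding and training-data modifications.
from transformers import MarianMTModel, MarianTokenizer

model_name = "Helsinki-NLP/opus-mt-en-cs"  # assumed en->cs model
tokenizer = MarianTokenizer.from_pretrained(model_name)
model = MarianMTModel.from_pretrained(model_name)

def ban_words(words):
    """Tokenize each banned surface form into the token-id sequences
    that generate() will refuse to emit."""
    return [tokenizer(w, add_special_tokens=False).input_ids for w in words]

src = "The contract must be signed by Friday."
inputs = tokenizer(src, return_tensors="pt")

# Prohibit one surface form of a Czech word ("smlouva", contract).
banned = ban_words(["smlouva"])

out = model.generate(**inputs, bad_words_ids=banned, num_beams=4)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```

Note that banning only the form "smlouva" leaves inflected variants such as "smlouvy" or "smlouvou" available to the decoder; this is exactly the evasion behavior the abstract describes, and the motivation for the paper's training with stemmed negative constraints.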

Authors (5)
  1. Josef Jon (12 papers)
  2. Dušan Variš (10 papers)
  3. Michal Novák (8 papers)
  4. João Paulo Aires (6 papers)
  5. Ondřej Bojar (91 papers)
Citations (1)
