Reducing Hallucinations in Neural Machine Translation with Feature Attribution (2211.09878v2)

Published 17 Nov 2022 in cs.CL

Abstract: Neural conditional language generation models achieve the state of the art in Neural Machine Translation (NMT) but are highly dependent on the quality of the parallel training data. When trained on low-quality data, these models are prone to various error types, including hallucinations, i.e., outputs that are fluent but unrelated to the source sentence. These errors are particularly dangerous because, on the surface, the translation can be perceived as correct, especially if the reader does not understand the source language. We present a case study focusing on model understanding and regularisation to reduce hallucinations in NMT. We first use feature attribution methods to study the behaviour of an NMT model that produces hallucinations. We then leverage these methods to propose a novel loss function that substantially reduces hallucinations and does not require retraining the model from scratch.
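The abstract only outlines the approach at a high level. As a rough illustration of the idea, the sketch below combines the usual cross-entropy objective with a penalty on outputs that receive little attribution from the source, using plain gradient-x-input attribution. The toy model, the choice of attribution method, and the exact penalty form are assumptions made for illustration, not the paper's published formulation.

```python
# Hypothetical sketch only: an attribution-regularised fine-tuning loss in the
# spirit of the abstract. The toy model, the gradient-x-input attribution, and
# the penalty form are illustrative assumptions, not the paper's formulation.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyNMT(nn.Module):
    """Toy stand-in for a trained encoder-decoder NMT model."""
    def __init__(self, vocab=100, dim=32):
        super().__init__()
        self.src_embed = nn.Embedding(vocab, dim)
        self.tgt_embed = nn.Embedding(vocab, dim)
        self.mix = nn.Linear(2 * dim, dim)
        self.out = nn.Linear(dim, vocab)

    def forward(self, src_emb, tgt_ids):
        # Mean-pooled source context broadcast to every decoder position.
        ctx = src_emb.mean(dim=1, keepdim=True).expand(-1, tgt_ids.size(1), -1)
        h = torch.tanh(self.mix(torch.cat([self.tgt_embed(tgt_ids), ctx], dim=-1)))
        return self.out(h)  # (batch, tgt_len, vocab) logits

def attribution_regularised_loss(model, src_ids, tgt_ids, lam=0.1):
    """Cross-entropy plus a penalty when the source contributes little to the output."""
    src_emb = model.src_embed(src_ids)  # requires grad via the embedding weights
    logits = model(src_emb, tgt_ids[:, :-1])
    ce = F.cross_entropy(logits.reshape(-1, logits.size(-1)),
                         tgt_ids[:, 1:].reshape(-1))
    # Gradient-x-input attribution of the target log-likelihood w.r.t. source embeddings.
    ll = F.log_softmax(logits, dim=-1).gather(-1, tgt_ids[:, 1:].unsqueeze(-1)).sum()
    grads = torch.autograd.grad(ll, src_emb, create_graph=True)[0]
    src_contrib = (grads * src_emb).sum(-1).abs().mean()
    # Low aggregate source attribution is treated here as a proxy for hallucination risk.
    return ce + lam * torch.relu(1.0 - src_contrib)

# Usage: continue training an already-converged model with this loss rather than
# retraining from scratch, as the abstract suggests.
model = TinyNMT()
src = torch.randint(1, 100, (4, 12))
tgt = torch.randint(1, 100, (4, 10))
loss = attribution_regularised_loss(model, src, tgt)
loss.backward()
```

The key design choice this sketch tries to convey is that the attribution signal is differentiable, so it can be folded into the training objective and applied during a short fine-tuning phase of an existing model rather than a full retraining run.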

Authors (3)
  1. Joël Tang (4 papers)
  2. Marina Fomicheva (11 papers)
  3. Lucia Specia (68 papers)
Citations (6)