
The Curious Case of Hallucinations in Neural Machine Translation (2104.06683v1)

Published 14 Apr 2021 in cs.CL, cs.AI, and cs.LG

Abstract: In this work, we study hallucinations in Neural Machine Translation (NMT), which lie at an extreme end on the spectrum of NMT pathologies. Firstly, we connect the phenomenon of hallucinations under source perturbation to the Long-Tail theory of Feldman (2020), and present an empirically validated hypothesis that explains hallucinations under source perturbation. Secondly, we consider hallucinations under corpus-level noise (without any source perturbation) and demonstrate that two prominent types of natural hallucinations (detached and oscillatory outputs) could be generated and explained through specific corpus-level noise patterns. Finally, we elucidate the phenomenon of hallucination amplification in popular data-generation processes such as Backtranslation and sequence-level Knowledge Distillation.

Citations (173)

Summary

  • The paper shows a strong correlation between high memorization values in Neural Machine Translation (NMT) models and the occurrence of hallucinations when source text is perturbed.
  • Specific noise patterns within NMT training corpora, such as Repeat-Unique, are shown to correlate with distinct types of hallucinations, like oscillatory hallucinations.
  • The study finds that hallucinations can be significantly amplified when using data generated by methods like Backtranslation and sequence-level Knowledge Distillation.

An Analysis of Hallucinations in Neural Machine Translation

The research paper titled "The Curious Case of Hallucinations in Neural Machine Translation" presents a comprehensive study of hallucinations in Neural Machine Translation (NMT). It addresses a distinct pathology of NMT systems: the generation of translations that are syntactically fluent but lack adequate correspondence with the source text, a phenomenon described as hallucination.

Key Contributions and Findings

The authors investigate hallucinations from multiple angles, exploring how different types of perturbations and noise affect NMT outputs. The research details three primary contributions:

  1. Hallucinations Under Source Perturbation: The authors explore how hallucinations occur due to source perturbations and relate this to Feldman's Long-Tail theory. They adapt the Memorization Value Estimator (MVE) to the sequence-to-sequence setting, utilizing metrics such as chrF to measure memorization and hallucination tendencies. The paper reveals a strong positive correlation between high memorization values and the occurrence of hallucinations under perturbations.
  2. Corpus-Level Noise and Hallucinations: The paper explores how different noise patterns within training data correlate with distinct hallucination types. The authors create four types of noise patterns: Unique-Unique (UU), Unique-Repeat (UR), Repeat-Unique (RU), and Repeat-Repeat (RR), to simulate various real-world corpus-level noise scenarios. The findings highlight that specific noise patterns (like RU) lead to specific hallucination manifestations, such as oscillatory hallucinations (OH).
  3. Hallucination Amplification in Data Generation: The amplification of hallucinations through the use of Backtranslation (BT) and sequence-level Knowledge Distillation (KD) is examined. The paper provides evidence that UR-type noise in the training data leads to significant amplification of hallucinations in both KD and BT, affecting subsequent NMT models trained on such data.
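
The memorization probe in point 1 can be sketched as follows: score the translation of a sample produced by a model trained with that sample against the translation produced by a model trained without it, using chrF. This is a minimal, illustrative version; the simplified chrF implementation and the function names are ours, not the paper's:

```python
from collections import Counter

def char_ngrams(text, n):
    # All character n-grams of a string (chrF includes spaces).
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))

def chrf(hypothesis, reference, max_n=4, beta=2.0):
    # Simplified chrF: mean character n-gram F-beta score over n = 1..max_n.
    f_scores = []
    for n in range(1, max_n + 1):
        hyp, ref = char_ngrams(hypothesis, n), char_ngrams(reference, n)
        if not hyp or not ref:
            continue
        overlap = sum((hyp & ref).values())
        precision = overlap / sum(hyp.values())
        recall = overlap / sum(ref.values())
        if precision + recall == 0:
            f_scores.append(0.0)
        else:
            f_scores.append((1 + beta ** 2) * precision * recall
                            / (beta ** 2 * precision + recall))
    return sum(f_scores) / len(f_scores) if f_scores else 0.0

def memorization_value(output_with, output_without, reference):
    # MVE idea in sequence form: the quality drop when the training
    # sample is held out. Inputs are translations produced by models
    # trained with and without the sample, scored against the reference.
    return chrf(output_with, reference) - chrf(output_without, reference)
```

A sample whose translation collapses once it is held out of training receives a memorization value near 1, and the paper's finding is that such high-memorization samples are the ones most prone to hallucinate under source perturbation.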

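The four corpus-level noise patterns from point 2, together with a crude check for oscillatory outputs, might be sketched like this (the placeholder sentences and the repetition threshold are illustrative assumptions, not values from the paper):

```python
from collections import Counter

def inject_noise(corpus, pattern, n_noise):
    # Append n_noise synthetic noisy pairs following one of the four
    # patterns: UU, UR, RU, RR. "R" means the same sentence is repeated
    # across all noise pairs on that side (source first, target second);
    # "U" means every noise pair gets a fresh sentence.
    fixed_src = "repeated source sentence"
    fixed_tgt = "repeated target sentence"
    noise = []
    for i in range(n_noise):
        src = fixed_src if pattern[0] == "R" else f"unique source {i}"
        tgt = fixed_tgt if pattern[1] == "R" else f"unique target {i}"
        noise.append((src, tgt))
    return corpus + noise

def is_oscillatory(tokens, n=2, min_repeats=3):
    # Crude detector for oscillatory hallucinations: flag an output in
    # which some n-gram repeats min_repeats times or more.
    grams = Counter(tuple(tokens[i:i + n])
                    for i in range(len(tokens) - n + 1))
    return any(count >= min_repeats for count in grams.values())
```

Under the RU pattern, many distinct targets compete for one repeated source, which is the configuration the paper links to oscillatory outputs.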
Implications

The research carries significant theoretical and practical implications for the field of NMT and broader AI systems:

  • Theoretical Implications: The paper extends Feldman's Long-Tail theory by empirically showcasing its relevance in explaining hallucinations in NMT. The association between memorized samples and hallucination occurrence under perturbations enriches the understanding of how neural networks trade off generalization against memorization, particularly in sequence-to-sequence models.
  • Practical Implications: The paper highlights the importance of robust data preprocessing and noise filtration techniques in corpus collection. It suggests leveraging countermeasures such as data augmentation, enhancement of learning algorithms to mitigate memorization, and the integration of robust noise-resilient training methodologies to reduce hallucination occurrences.

Future Directions

This comprehensive engagement with NMT hallucinations opens multiple potential research avenues:

  • Robustification Techniques: Developing data augmentation and robust training algorithms specifically targeting the long-tail memorization problem can greatly enhance NMT systems' stability and reliability against hallucinations.
  • Noise Detection and Filtering: Automating reliable detection of corpus-level noise can prevent the training of models on erroneous data. Advanced filtering techniques informed by this paper could be integrated into standard NMT training pipelines.
  • Integration of Advanced Diagnostic Tools: Building upon the findings regarding noise patterns and hallucinations, as well as the novel application of the MVE in sequence scenarios, NMT systems could be enhanced with sophisticated diagnostic tools to predict and mitigate potential hallucinations systematically.
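
As a toy example of the corpus-level filtering suggested above, a heuristic pass might drop length-mismatched pairs and "copy noise" where the target merely duplicates the source. This sketch is our illustration of the idea, not a technique from the paper:

```python
def basic_noise_filter(pairs, max_len_ratio=2.0):
    # Heuristic parallel-corpus filter: drop pairs whose source/target
    # token-length ratio is extreme, and drop pairs where the target is
    # an exact copy of the source.
    kept = []
    for src, tgt in pairs:
        len_src = max(len(src.split()), 1)
        len_tgt = max(len(tgt.split()), 1)
        ratio = max(len_src, len_tgt) / min(len_src, len_tgt)
        if ratio <= max_len_ratio and src.strip() != tgt.strip():
            kept.append((src, tgt))
    return kept
```

Such simple heuristics would not catch all of the noise patterns studied in the paper (a repeated-source RU pair can be perfectly length-matched), which is precisely why the authors' analysis motivates more targeted detection.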

In sum, the paper effectively elucidates the multi-faceted issue of hallucinations in NMT, connecting memorization phenomena with hallucination occurrences, and delineating how corpus-level noise influences NMT outputs. The insights derived from this paper hold the promise of significantly advancing both the theoretical understanding and the practical capabilities of modern NMT systems.