- The paper demonstrates that geometric scaling of memory readout vectors in memory-augmented LLMs significantly reduces hallucination, outperforming the GRACE model-editing method on both RougeL and Jaccard similarity scores.
- It employs a training-free approach in which memory operations reduce to lightweight matrix multiplications, enabling fast synthesis of WikiBio entries.
- Results reveal that a fixed scaling factor between 3 and 4 is optimal, improving factual accuracy while retaining a large runtime advantage over iterative model-editing techniques.
Generation Constraint Scaling Can Mitigate Hallucination
The paper "Generation Constraint Scaling Can Mitigate Hallucination" by Kollias et al. explores a novel method for addressing the issue of hallucination in LLMs, particularly those augmented with explicit memory mechanisms. The research builds upon the premise that hallucinations in language generation can often be linked to the model's memory characteristics. By leveraging geometry-inspired scaling of readout vectors in a memory-augmented LLM decoder, the authors demonstrate an effective, training-free approach to mitigating hallucinations, which they empirically validate against the state-of-the-art (SOTA) model editing method known as GRACE.
The problem of hallucination in LLMs manifests as the generation of text that is factually incorrect or fabricated. Existing mitigation strategies include model editing, which modifies model parameters to correct specific outputs, and context-grounding, which supplies the required factual context within the input prompt. Both incur significant computational costs and practical challenges.
The paper builds on Larimar, a memory-augmented LLM decoder that incorporates an external episodic memory controller. This architecture comprises an encoder, an associative memory module, and a decoder, with memory readout vectors used to adjust the decoder based on the prompt. GRACE, by contrast, is a model-editing technique that installs discrete key-value adapters at selected layers of the model.
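This read path can be pictured as follows. The sketch below is a minimal PyTorch-style illustration under assumed shapes and pseudo-inverse addressing; `encoder`, `decoder`, and `memory` are placeholders, not Larimar's actual interfaces.

```python
import torch

def generate_with_memory(prompt, encoder, memory, decoder):
    """Sketch of a Larimar-style read path (component names are placeholders).

    encoder: maps a prompt to a latent query of shape (d,)
    memory:  (k, d) associative memory matrix
    decoder: generates text conditioned on a readout vector
    """
    z_query = encoder(prompt)                    # latent encoding of the prompt
    w = z_query @ torch.linalg.pinv(memory)      # (k,) addressing weights
    z_read = w @ memory                          # (d,) memory readout vector
    return decoder(prompt, conditioning=z_read)  # decoder adjusted by the readout
```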
Methodology
To investigate the effectiveness of the proposed hallucination-mitigation method, the authors employ the WikiBio dataset, a benchmark of Wikipedia-style biographies generated by GPT-3 and annotated for factual accuracy. Their evaluation proceeds in three steps (a minimal sketch of the loop follows the list):
- Informing Models: The models are informed of (prompt, input) pairs derived from WikiBio entries.
- Generating Output: The models then generate new sentences based solely on the prompt.
- Constructing Entries: The generated sentences are concatenated to form new, synthesized WikiBio entries.
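Concretely, this loop might look like the sketch below; `write_to_memory` and `generate` are hypothetical method names standing in for the actual model interfaces, and whether writes are interleaved with generation or performed up front is an assumption.

```python
def synthesize_entry(model, pairs):
    """Synthesize a WikiBio entry from (prompt, input) pairs (steps 1-3)."""
    sentences = []
    for prompt, reference_input in pairs:
        model.write_to_memory(prompt, reference_input)  # step 1: inform the model
        sentences.append(model.generate(prompt))        # step 2: generate from the prompt alone
    return " ".join(sentences)                          # step 3: concatenate into an entry
```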
The Larimar model, augmented with an explicit memory mechanism, is compared against GRACE, the existing SOTA method. The primary evaluation metrics are the RougeL score and the Jaccard similarity index, both computed against the actual WikiBio entries.
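Both metrics are straightforward to compute. A minimal sketch using the open-source `rouge_score` package follows; whitespace tokenization for the Jaccard index is an assumption about the paper's exact setup.

```python
from rouge_score import rouge_scorer  # pip install rouge-score

_scorer = rouge_scorer.RougeScorer(["rougeL"], use_stemmer=True)

def rouge_l(reference: str, generated: str) -> float:
    """RougeL F-measure between the true entry and a synthesized entry."""
    return _scorer.score(reference, generated)["rougeL"].fmeasure

def jaccard(reference: str, generated: str) -> float:
    """Jaccard index over word sets; the tokenization here is a simplification."""
    a, b = set(reference.split()), set(generated.split())
    return len(a & b) / len(a | b) if (a | b) else 0.0
```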
Results
Initial evaluations reveal that the base Larimar configuration does not outperform GRACE, with lower RougeL and Jaccard similarity scores across the board. However, in an oracle setting where readout vectors align perfectly with the written vectors, Larimar significantly outperforms GRACE, indicating substantial headroom in how memory readouts are used.
To exploit this headroom, the authors scale the readout vectors by varying factors, finding that a fixed scaling factor in the range of 3 to 4 best minimizes hallucination. Specifically, Larimar achieves significantly higher RougeL scores (0.72 with a scaling factor of 4, versus 0.49 for GRACE) and better Jaccard similarity, confirming the hypothesis that geometrically rescaling readout vectors can substantially mitigate hallucination.
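In code, the intervention amounts to one extra line in the read path sketched earlier, with `alpha` as the fixed scaling factor (the surrounding names remain illustrative placeholders):

```python
import torch

def generate_with_scaled_readout(prompt, encoder, memory, decoder, alpha=4.0):
    """Read path with geometric scaling of the readout (alpha in [3, 4] reported best)."""
    z_query = encoder(prompt)
    w = z_query @ torch.linalg.pinv(memory)
    z_read = alpha * (w @ memory)                # scale the readout vector
    return decoder(prompt, conditioning=z_read)
```

Because the change is a scalar multiplication on an existing vector, it adds no training and essentially no runtime cost.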
Complexity Considerations
On the computational complexity front, the relative efficiency of Larimar's memory operations stands out. While the GRACE model requires iterative backpropagation steps for model edits, Larimar employs lightweight matrix multiplications for memory writes and reads, leading to much faster synthesis of WikiBio entries (3.1 seconds for Larimar compared to 37.8-162.5 seconds for GRACE).
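The cost asymmetry is easy to see in a sketch: a Kanerva-machine-style write and read each reduce to a couple of matrix solves, whereas GRACE must run gradient steps per edit. The update rule below is a simplified stand-in, not Larimar's exact formulation.

```python
import torch

def write_memory(m0: torch.Tensor, z: torch.Tensor) -> torch.Tensor:
    """One-shot write: m0 is the (k, d) prior memory, z holds (n, d) encodings.

    Address the encodings against the prior memory, then solve (least squares)
    for a memory whose readouts reproduce the stored encodings. No backprop.
    """
    w = z @ torch.linalg.pinv(m0)      # (n, k) addressing weights
    return torch.linalg.pinv(w) @ z    # (k, d) updated memory

def read_memory(memory: torch.Tensor, z_query: torch.Tensor) -> torch.Tensor:
    """Read with the same two multiplications, again gradient-free."""
    w = z_query @ torch.linalg.pinv(memory)
    return w @ memory
```

Since each edit costs a fixed number of linear-algebra calls rather than an iterative optimization loop, this accounts for the order-of-magnitude runtime gap reported above.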
Discussion
The paper posits that constraining generation by scaling readout vectors in Larimar offers a promising, training-free route to hallucination mitigation. The method's effectiveness hinges on the presence of an explicit memory mechanism and may not extend to architectures without one. Nonetheless, the significant improvements in accuracy and runtime efficiency underscore the potential for further research into geometry-inspired operations in LLM architectures.
Future work may optimize the scaling parameters, investigate adaptive scaling strategies across datasets and model architectures, and extend the approach to other memory-augmented LLMs and a broader range of language-generation tasks, which could lead to wider applicability and stronger hallucination mitigation.
References
- Kollias, G., Das, P., Chaudhury, S. "Generation Constraint Scaling Can Mitigate Hallucination." 2024.
- Das, P., et al. "Larimar: Large Language Models with Episodic Memory Control." 2024.
- Hartvigsen, T., Sankaranarayanan, S., Palangi, H., Kim, Y., Ghassemi, M. "Aging with GRACE: Lifelong Model Editing with Discrete Key-Value Adaptors." 2022.
- Manakul, P., Liusie, A., Gales, M. "SelfCheckGPT: Zero-Resource Black-Box Hallucination Detection for Generative Large Language Models." 2023.
This paper elucidates a novel direction for mitigating hallucination in LLMs, advocating simple yet effective geometry-based methods that complement current approaches and offer both practical and theoretical advances in AI research.