- The paper introduces the Imaginary Question Answering framework, demonstrating that LLMs generate and answer fictional queries with notable accuracy.
- It shows that models answer one another's fictional questions far above chance, averaging 86% correctness on context-based questions, with agreement highest when the question and answer models come from the same family, indicating a shared latent space.
- The study highlights challenges in hallucination detection and raises questions about the limits of computational creativity in modern LLMs.
Shared Imagination: LLMs Hallucinate Alike
The paper "Shared Imagination: LLMs Hallucinate Alike" by Yilun Zhou, Caiming Xiong, Silvio Savarese, and Chien-Sheng Wu, explores an intriguing phenomenon among LLMs: their propensity to generate and answer entirely fictional questions with high accuracy, indicating a shared latent space or "shared imagination."
Introduction and Motivation
The primary motivation behind this paper is to investigate whether LLMs, which are built from broadly similar training recipes (e.g., model architecture, pre-training data, and optimization algorithms), also produce inherently similar outputs. Specifically, the authors introduce a novel experimental framework, Imaginary Question Answering (IQA), to probe these similarities through the generation and evaluation of purely imaginary questions.
Imaginary Question Answering Framework
The IQA framework assigns two roles to the models: the Question Model (QM) and the Answer Model (AM). The QM generates multiple-choice questions about completely fictional concepts, while the AM attempts to answer them. Question generation comes in two modes (both sketched in the code after the list):
- Direct Question (DQ): The model generates a standalone fictional question.
- Context-based Question (CQ): The model first generates a fictional context paragraph and then formulates a question based on it.
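The following is a minimal sketch of this loop under stated assumptions: `complete()` is a hypothetical hook for whatever chat-completion client you use, and the prompt templates are paraphrases, not the paper's exact prompts.

```python
# Minimal sketch of the IQA loop. `complete()` is a hypothetical hook, and the
# prompt templates below are paraphrases of the paper's setup, not its exact text.

DQ_PROMPT = (
    "Invent a purely fictional concept in {topic} and write one multiple-choice "
    "question about it with options A-D. Indicate the correct option."
)
CQ_PROMPT = (
    "Write a short fictional paragraph about an imaginary concept in {topic}, then "
    "write one multiple-choice question (options A-D) answerable from that paragraph. "
    "Indicate the correct option."
)
ANSWER_PROMPT = (
    "Answer the following multiple-choice question with a single letter (A-D).\n\n{question}"
)

def complete(model: str, prompt: str) -> str:
    """Hypothetical hook: send `prompt` to `model` via your preferred LLM client."""
    raise NotImplementedError("wire this to an actual API client")

def generate_question(question_model: str, topic: str, context_based: bool) -> str:
    """Question Model (QM): produce a direct (DQ) or context-based (CQ) question."""
    template = CQ_PROMPT if context_based else DQ_PROMPT
    return complete(question_model, template.format(topic=topic))

def answer_question(answer_model: str, question: str) -> str:
    """Answer Model (AM): choose one of the four options for a fictional question."""
    return complete(answer_model, ANSWER_PROMPT.format(question=question))
```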
Experimental Setup and Results
The experiments evaluate 13 LLMs from four major model families (GPT, Claude, Mistral, and Llama 3) on IQA tasks spanning multiple topics, such as physics, literature, and economics.
Key findings include:
- On average, models achieved 54% correctness on DQs and a significantly higher 86% on CQs, against a random-chance baseline of 25% (see the aggregation sketch after this list).
- Higher accuracy was observed when the QM and AM were from the same model family or were the same model.
- The models displayed a high level of consistency in their responses, suggesting a surprising degree of agreement on imaginary content.
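To make those numbers concrete, here is a minimal sketch of how correctness can be aggregated per (QM, AM) pair and compared to the 25% chance baseline; the records below are illustrative placeholders, not the paper's data.

```python
from collections import defaultdict

# Illustrative placeholder records, one per answered question:
# (question_model, answer_model, did the AM pick the option the QM marked correct?)
records = [
    ("gpt-family-model", "gpt-family-model", True),
    ("gpt-family-model", "llama-3-model", True),
    ("claude-family-model", "mistral-model", False),
    # ... one entry per (QM, AM, question) evaluation
]

RANDOM_CHANCE = 0.25  # four answer options per question

def correctness_by_pair(records):
    """Average correctness for every (question model, answer model) pair."""
    tallies = defaultdict(lambda: [0, 0])  # pair -> [correct, total]
    for qm, am, ok in records:
        tallies[(qm, am)][0] += int(ok)
        tallies[(qm, am)][1] += 1
    return {pair: correct / total for pair, (correct, total) in tallies.items()}

for (qm, am), acc in correctness_by_pair(records).items():
    print(f"QM={qm:<22} AM={am:<22} correctness={acc:.0%} ({acc / RANDOM_CHANCE:.1f}x chance)")
```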
In-Depth Analyses
The paper investigates several research questions to explore this phenomenon:
- Data Characteristics: Although the questions come from different models, they show notable homogeneity in surface structure and in embedding space across topics.
- Heuristics for Correct Choice: Models exhibited non-trivial heuristics, such as preferring the longest answer option, but these alone cannot explain the high correctness rates (a baseline sketch follows this list).
- Fictionality Awareness: Models often answered fictional questions as if they were real, although they could detect fictionality when directly asked about it.
- Effect of Model "Warm-Up": Generating several questions in sequence and generating longer questions both raised answer accuracy.
- Universality of the Phenomenon: Earlier models and models without instruction tuning did not exhibit the same behavior, suggesting that pre-training data and recent instruction tuning play critical roles.
- Other Content Types: The same high correctness rates were observed in fictional creative writing tasks, showing the behavior extends beyond knowledge-based hallucinations.
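As one concrete way to probe the heuristics item above, the sketch below measures how far an "always pick the longest option" baseline gets; the question format and the example entry are assumptions for illustration, not the paper's data or method.

```python
# Sketch of a "pick the longest option" baseline. Each question is assumed to be
# a dict with four options keyed "A"-"D" plus the QM-designated correct key;
# the example entry is a made-up placeholder.
questions = [
    {
        "options": {
            "A": "the hypothetical Veltran cascade effect",
            "B": "a brief thermal flux",
            "C": "orbital decay",
            "D": "none of the above",
        },
        "correct": "A",
    },
    # ... more generated questions
]

def longest_option_accuracy(questions) -> float:
    """Accuracy obtained by always choosing the longest answer option."""
    hits = sum(
        max(q["options"], key=lambda k: len(q["options"][k])) == q["correct"]
        for q in questions
    )
    return hits / len(questions)

print(f"longest-option baseline: {longest_option_accuracy(questions):.0%}")
```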
Implications and Future Work
These findings have several theoretical and practical implications:
- Model Homogeneity: The high degree of agreement among different LLMs on fictional content suggests underlying homogeneity, which could affect how we understand and interpret their outputs.
- Hallucination Detection: Because different models tend to agree on fabricated content, detecting hallucinations, especially by cross-checking one model against another, may be harder than expected and would require more advanced methodologies.
- Computational Creativity: The shared imagination space raises questions about the true extent of creativity that LLMs can exhibit, pointing toward potential limits.
Future research could expand these investigations by including additional model families, exploring different content types, and employing interpretability analyses to understand the underlying mechanisms of this shared imagination space.
Conclusion
The paper presents a thorough examination of the "shared imagination" among LLMs, uncovering surprising similarities in their behavior when generating and answering fictional questions. These insights contribute to a deeper understanding of LLM capabilities and highlight important areas for future research in AI.