Trapping LLM Hallucinations Using Tagged Context Prompts (2306.06085v1)

Published 9 Jun 2023 in cs.CL and cs.AI

Abstract: Recent advances in LLMs, such as ChatGPT, have led to highly sophisticated conversation agents. However, these models suffer from "hallucinations," where the model generates false or fabricated information. Addressing this challenge is crucial, particularly with AI-driven platforms being adopted across various sectors. In this paper, we propose a novel method to recognize and flag instances when LLMs perform outside their domain knowledge, ensuring that users receive accurate information. We find that the use of context combined with embedded tags can successfully combat hallucinations within generative LLMs. To do this, we baseline hallucination frequency in no-context prompt-response pairs using generated URLs as easily-tested indicators of fabricated data. We observed a significant reduction in overall hallucination when context was supplied along with question prompts for tested generative engines. Lastly, we evaluated how placing tags within contexts impacted model responses and were able to eliminate hallucinations in responses with 98.88% effectiveness.

The paper "Trapping LLM Hallucinations Using Tagged Context Prompts" (Feldman et al., 2023) addresses the significant problem of hallucinations in LLMs, where models generate false or fabricated information. This issue is critical, especially as LLMs are increasingly used in domains requiring high accuracy, such as the legal, medical, and educational fields.

The authors propose a novel method to combat these hallucinations: using context combined with embedded tags. The core idea is to provide the LLM with specific information (context) relevant to the query and embed unique identifiers (tags) within this context. By instructing the LLM to reference these tags when answering, the generated response can be verified against the provided "known good" sources, thereby trapping or flagging information not grounded in the context.
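
To make the mechanism concrete, here is a minimal sketch of how such a tagged-context prompt could be assembled. The (source xxxx) tag format follows the paper; the instruction wording, example sentences, and tag numbers are illustrative assumptions rather than the paper's exact prompt templates.

```python
# Illustrative assembly of a tagged-context prompt. The (source xxxx)
# tag format follows the paper; everything else is assumed wording.
context = (
    "France is a country in Western Europe. (source 4821) "
    "Its capital and largest city is Paris. (source 7303)"
)
question = "What is the capital of France?"

prompt = (
    "Answer the question using only the context below, and cite the "
    "(source xxxx) tag of each sentence you rely on. If the context "
    "does not contain the answer, say so.\n\n"
    f"Context: {context}\n\n"
    f"Question: {question}"
)
print(prompt)
```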

The methodology involved several steps:

  1. Data Set Creation: A dataset of questions and contexts was generated. Contexts were derived from summarized Wikipedia articles on various topics (France, chess, etc.) and sections from a recently published book on computational sociology. This allowed for testing both domains the LLM was likely trained on and domains it was unlikely to know about (the book).
  2. Verification: Contexts were manually cross-referenced with original sources, and questions were checked for relevance and coherence with their corresponding contexts.
  3. Tag Placement: A script automatically inserted unique (source xxxx) tags at the end of each sentence within the context prompts (a sketch of such a script follows this list).
  4. Experiments: Prompts were tested across various OpenAI models (including GPT-4, GPT-3.5-turbo, and older models). Three types of prompts were used:
    • No-context questions, asking the model to provide details and sources.
    • Tagged-context questions with relevant context.
    • Tagged-context questions with mismatched context (context irrelevant to the question).
  5. Data Collection and Analysis: Responses were stored and analyzed. Hallucination was primarily measured by the generation of URLs in no-context responses and by the presence or absence of the correct tags in tagged-context responses. URLs were programmatically checked for validity (HTTP status 200) as a proxy for factual correctness, and tag usage was checked by verifying whether generated responses cited the embedded tags (see the verification sketch after this list).
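
A rough sketch of the tag-placement step in item 3 might look like the following; the sentence splitter and the four-digit tag values are assumptions for illustration, not the authors' actual script.

```python
import random
import re

def tag_context(context: str) -> tuple[str, set[str]]:
    """Append a unique (source xxxx) tag to the end of each sentence."""
    sentences = re.split(r"(?<=[.!?])\s+", context.strip())
    tagged, tags = [], set()
    for sentence in sentences:
        tag = f"(source {random.randint(1000, 9999)})"
        while tag in tags:  # re-draw on the rare collision so tags stay unique
            tag = f"(source {random.randint(1000, 9999)})"
        tags.add(tag)
        tagged.append(f"{sentence} {tag}")
    return " ".join(tagged), tags
```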

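For the checks described in item 5, a minimal verification sketch is shown below, assuming an HTTP HEAD request with a 200 status as the reachability test and a four-digit tag pattern; the helper names are hypothetical, not from the paper.

```python
import re
import requests

def url_is_reachable(url: str, timeout: float = 10.0) -> bool:
    """Use an HTTP 200 response as a rough proxy for a real, reachable URL."""
    try:
        resp = requests.head(url, timeout=timeout, allow_redirects=True)
        return resp.status_code == 200
    except requests.RequestException:
        return False

def cited_tags(response: str, supplied_tags: set[str]) -> set[str]:
    """Return which of the supplied (source xxxx) tags the response cites."""
    found = set(re.findall(r"\(source \d{4}\)", response))
    return found & supplied_tags

# A response that cites none of the supplied tags would be flagged as
# ungrounded; URLs with non-200 status would be counted as likely hallucinations.
```
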
The experiments yielded three key findings:

  1. Hallucinations in No-context: When no context was provided, LLMs frequently hallucinated. Across 1,715 no-context prompt-response pairs, 2,445 unique URLs were generated, with a substantial majority (1,605) being incorrect or unreachable. Non-URL references were also often unverifiable or too general.
  2. Influence of Context: Simply providing context, whether relevant or mismatched, dramatically reduced the generation of spurious URLs. With context, only 48 URLs were generated across 1,715 prompt-response pairs, compared to 2,445 without context. This suggests LLMs tend to ground their responses in the provided context even when it is irrelevant to the question (often resulting in the model stating that the context is not applicable). The likelihood of hallucination, measured by URL generation, thus fell to roughly 2% of the no-context level.
  3. Effect of Tags: Embedding tags within the context proved highly effective in allowing verification of generated responses. In 244 responses generated from relevant-context prompts with tags, only one failed to reference any of the supplied tags, meaning 99.6% of these responses were demonstrably grounded in the provided, tagged context. This effectively trapped hallucinations originating from outside the context. A rare edge case involved a mismatched context where the model referenced the tags but provided an irrelevant answer, highlighting the need for robust handling of such scenarios.

The authors conclude that context-based prompting, particularly with the addition of tags, is a powerful technique for anchoring LLM responses to provided sources, significantly reducing inappropriate hallucinations. They report that adding tags reduced hallucinations, as validated by tag presence, by nearly 100% compared to the no-context baseline: 98.88% effectiveness, obtained by comparing the 1,605 URL hallucinations in the no-context case against the 2 tagged-context responses in which tags were missing or used inappropriately.

Practical implications include building more trustworthy AI systems by ensuring responses are traceable to source material. The paper notes limitations: the method is not effective against prompt injection ("jailbreaking") attacks designed to make the model disregard instructions, and it does not address hallucinations if the provided context itself is inaccurate or "poisoned." However, poisoning the context shifts the problem to one of source verification, which is arguably more manageable than dealing with latent biases or fabrications within the LLM's training data. Future work could explore handling edge cases, testing on other models, and optimizing prompt design for tagged contexts.

Authors
  1. Philip Feldman
  2. James R. Foulds
  3. Shimei Pan