Osiris: A Lightweight Open-Source Hallucination Detection System
The paper "Osiris: A Lightweight Open-Source Hallucination Detection System" tackles a central challenge in AI-driven text generation: hallucinations in retrieval-augmented generation (RAG) systems built on large language models (LLMs). Hallucinations here refer to model-generated responses that lack fidelity to the provided context.
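To make the task concrete, the detection problem can be framed as a binary judgment over a RAG triple of context, question, and response. The prompt template below is an illustrative sketch, not the paper's actual input format, and all names in it are hypothetical.

```python
# Hypothetical framing of the hallucination-detection task: given retrieved
# context and a generated response, judge whether the response is supported.
# This template is illustrative only, not the format Osiris-7B was trained on.

def build_detection_prompt(context: str, question: str, response: str) -> str:
    """Frame hallucination detection as a yes/no judgment over a RAG triple."""
    return (
        "Context:\n" + context + "\n\n"
        "Question: " + question + "\n"
        "Response: " + response + "\n\n"
        "Is the response fully supported by the context? Answer yes or no."
    )

prompt = build_detection_prompt(
    context="Osiris was an Egyptian god of the afterlife.",
    question="What was Osiris the god of?",
    response="Osiris was the god of the sun.",  # unsupported by the context
)
```

A detector model then consumes such a prompt and emits a supported/hallucinated label for the response.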
Introduction
The research addresses limitations in existing hallucination detection methods for RAG systems, which often rely on human evaluations or closed-source models. These conventional approaches scale poorly, primarily because of their high cost and slow inference. Osiris-7B is introduced to mitigate these challenges, offering improved recall in hallucination detection via supervised fine-tuning while maintaining competitive precision and accuracy.
Methodology
Osiris-7B is trained on a perturbed multi-hop QA dataset, extending beyond traditional single-document question-answering frameworks. This positions Osiris-7B to handle contexts involving information synthesis across multiple documents and to discern responses unsupported by retrieved evidence. The dataset construction draws multi-hop questions from sources such as MuSiQue, which require combining information from several documents to arrive at an answer, strengthening the model's multi-hop reasoning and its ability to detect hallucinations.
The dataset perturbs verified questions to induce hallucinations, training Osiris-7B to distinguish plausible yet unsupported claims from genuinely evidence-backed answers. This approach hones the model's reasoning capabilities through structured multi-hop frameworks that enforce explicit evidence retrieval across multiple steps.
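The perturbation idea described above can be sketched as follows. This is an illustrative reconstruction, not the paper's exact procedure: the supporting documents stay fixed while the verified answer is swapped for a plausible but unsupported one, yielding a labeled negative training instance. All field names and values are hypothetical.

```python
# Illustrative sketch (not the paper's exact recipe): create a hallucinated
# training instance from a verified multi-hop QA example by keeping the
# documents fixed and replacing the answer with an unsupported distractor.

def perturb_example(example: dict, distractor_answer: str) -> dict:
    """Return a copy of a verified example relabeled as a hallucination."""
    perturbed = dict(example)
    perturbed["answer"] = distractor_answer
    perturbed["label"] = "hallucinated"  # no longer supported by the documents
    return perturbed

verified = {
    "question": "Which city is home to the university founded by person X?",
    "documents": ["doc_1 text ...", "doc_2 text ..."],  # multi-hop evidence
    "answer": "CityA",
    "label": "supported",
}
negative = perturb_example(verified, distractor_answer="CityB")
```

Pairing each verified example with such a perturbed twin gives the detector contrastive supervision: same evidence, differing answer fidelity.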
Results
Empirical evidence indicates Osiris-7B significantly surpasses GPT-4o in recall, achieving an improvement of 22.8% on the RAGTruth benchmark. The model prioritizes recall, which is essential in industry scenarios where catching every possible hallucination matters more than precision. Osiris-7B also infers faster, at 141.98 tokens per second versus GPT-4o's 97 tokens per second, making it a practical choice for real-time applications in industry settings.
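The recall-first emphasis is easy to see from the metric definitions. The sketch below treats hallucination as the positive class; the toy predictions are invented for illustration and are not the paper's data.

```python
# Minimal sketch of the recall/precision trade-off, with hallucination as
# the positive class (label 1). Recall measures how many true hallucinations
# the detector catches; the example numbers below are made up.

def recall_precision(y_true: list, y_pred: list):
    """Compute (recall, precision) treating label 1 as the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    recall = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    return recall, precision

# 4 true hallucinations: the detector catches 3 (one missed) and also
# wrongly flags 1 clean response.
r, p = recall_precision([1, 1, 1, 1, 0, 0], [1, 1, 1, 0, 1, 0])
# r == 0.75, p == 0.75
```

In a deployment that flags responses for human review, a false positive costs a few seconds of reviewer time, while a false negative lets a hallucination reach the user, which is why recall is the headline metric here.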
Contributions and Implications
The paper outlines several key contributions:
- Osiris-7B demonstrates superior recall in detecting hallucinations, making it viable for production environments where traditional methods fall short.
- It achieves significant performance enhancement with lower computational demands, rendering it suitable for deployments requiring efficiency and speed.
The theoretical implications of this research indicate a promising direction towards developing more reliable LLM-driven systems that can ensure factual correctness in outputs, particularly in sensitive domains such as healthcare and finance. The practical implications involve advancing open-source solutions that can be scaled effectively, providing a robust alternative to costly, closed-source models.
Conclusion and Future Research
The paper presents a compelling case for Osiris-7B as a promising development in detecting unfaithful machine-generated text. Its emphasis on recall and speed addresses existing criticisms well, although precision remains an area for future improvement. Future research might integrate domain-specific datasets to bolster precision and better balance false positives against false negatives, and continue refining structured reasoning frameworks to overcome the limitations of single-hop approaches.
Osiris-7B stands out as an efficient, scalable alternative in the hallucination detection landscape, opening opportunities for further work on AI reliability and efficiency.