Do We Know What LLMs Don't Know? A Study of Consistency in Knowledge Probing (2505.21701v2)

Published 27 May 2025 in cs.CL

Abstract: The reliability of LLMs is greatly compromised by their tendency to hallucinate, underscoring the need for precise identification of knowledge gaps within LLMs. Various methods for probing such gaps exist, ranging from calibration-based to prompting-based methods. To evaluate these probing methods, in this paper, we propose a new process based on using input variations and quantitative metrics. Through this, we expose two dimensions of inconsistency in knowledge gap probing. (1) Intra-method inconsistency: Minimal non-semantic perturbations in prompts lead to considerable variance in detected knowledge gaps within the same probing method; e.g., the simple variation of shuffling answer options can decrease agreement to around 40%. (2) Cross-method inconsistency: Probing methods contradict each other on whether a model knows the answer. Methods are highly inconsistent -- with decision consistency across methods being as low as 7% -- even though the model, dataset, and prompt are all the same. These findings challenge existing probing methods and highlight the urgent need for perturbation-robust probing frameworks.

Summary

  • The paper analyzes significant intra-method and cross-method inconsistencies in current knowledge probing techniques for Large Language Models.
  • Intra-method inconsistency is high; minimal non-semantic prompt variations can cause agreement within the same probing method to drop to as low as 31%.
  • Cross-method inconsistency is even higher, with decision agreement between different methods falling to approximately 7% even when applied to the same LLM and dataset.

Exploring Inconsistencies in Knowledge Probing of LLMs

The paper "Do We Know What LLMs Don't Know? A Study of Consistency in Knowledge Probing" by Raoyuan Zhao et al. provides an insightful analysis into the inconsistencies found in current methodologies used for probing the knowledge gaps in LLMs. The exploration is particularly centered around two dimensions of inconsistency: intra-method and cross-method inconsistencies. The research is motivated by the pressing need to address the hallucinations in LLMs—where models generate factually incorrect data despite producing fluent responses.

Core Findings

  1. Intra-method Inconsistency: The research identifies a significant level of inconsistency within the same probing method when minimal non-semantic prompt variations, such as shuffling answer options, are introduced. Findings indicate that such perturbations can cause agreement percentages to drop as low as 31% (a minimal sketch of this kind of perturbation check follows this list).
  2. Cross-method Inconsistency: The paper demonstrates that different probing methods applied to the same LLM frequently produce contradictory results, with agreement between methods as low as approximately 7%. This inconsistency persists even when the model, dataset, and prompt are held constant across methods.
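
To make the intra-method check concrete, here is a minimal Python sketch of the kind of consistency test described: the same multiple-choice item is probed twice, once with the original option order and once with the options shuffled, and the two knows/doesn't-know decisions are compared. The `probe_knows` callable stands in for any of the probing methods (e.g., a thresholded token-probability score); it and the helper functions are illustrative assumptions, not the authors' code.

```python
import random

def format_mcq(question, options):
    """Render a multiple-choice prompt with lettered answer options."""
    letters = "ABCDEFGH"
    lines = [question] + [f"{letters[i]}. {opt}" for i, opt in enumerate(options)]
    return "\n".join(lines)

def shuffled_variant(question, options, seed=0):
    """Non-semantic perturbation: the same item with answer options reordered."""
    rng = random.Random(seed)
    perm = list(options)
    rng.shuffle(perm)
    return format_mcq(question, perm)

def intra_method_agreement(items, probe_knows):
    """Fraction of items where the knows/doesn't-know decision is unchanged
    between the original prompt and its shuffled-options variant.
    `probe_knows` is an assumed callable: prompt -> bool."""
    agree = 0
    for question, options in items:
        original = probe_knows(format_mcq(question, options))
        perturbed = probe_knows(shuffled_variant(question, options))
        agree += int(original == perturbed)
    return agree / len(items)
```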

Methodological Approach

The paper adopts a systematic methodology, employing a range of zero-shot and one-shot perturbation-based prompts to assess the consistency of probing methods. The probing methods analyzed fall into calibration-based, training-based, prompting-based, and consistency-based families; specific methods tested include TokProb, AskCal, SelfRef, NOTA, and MoreInfo, among others. Each method's robustness is evaluated under multiple perturbations to test its sensitivity and stability.
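
Cross-method consistency can be quantified as the pairwise agreement between the binary knows/doesn't-know decisions two probing methods produce on the same questions. The sketch below is a rough illustration of such a metric, not the paper's exact formulation; the toy decision values are placeholders.

```python
from itertools import combinations

def pairwise_consistency(decisions):
    """decisions maps a method name to a list of binary knows/doesn't-know
    decisions, one per question. Returns the fraction of questions on which
    each pair of methods agrees."""
    scores = {}
    for a, b in combinations(decisions, 2):
        pairs = list(zip(decisions[a], decisions[b]))
        scores[(a, b)] = sum(x == y for x, y in pairs) / len(pairs)
    return scores

# Toy decisions for three of the probing methods named above
# (placeholder values, not results from the paper).
toy = {
    "TokProb": [True, True, False, True],
    "AskCal":  [True, False, False, False],
    "SelfRef": [False, True, True, False],
}
print(pairwise_consistency(toy))
```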

Experimental Setup

The paper employs a variety of LLMs, including models from the LLaMA and Mistral families, with sizes ranging from 1B to 70B parameters. This diversity allows the researchers to examine the influence of model scale on probing-method stability. The datasets used include MMLU and HellaSwag, providing diverse scenarios for assessing how each probing method behaves under differing conditions.
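
The setup implies a simple evaluation grid: every probing method is scored on every model/dataset combination so that stability can be compared across scales. A hedged sketch of such a grid, with placeholder model and dataset labels and assumed helper functions rather than the authors' actual API, might look like this:

```python
MODELS = ["llama-1b", "llama-8b", "llama-70b", "mistral-7b"]  # placeholder labels
DATASETS = ["mmlu", "hellaswag"]

def run_grid(models, datasets, probes, load_items, agreement_fn):
    """probes maps a method name to a factory binding that probe to a model;
    load_items returns a list of (question, options) pairs for a dataset;
    agreement_fn scores a probe's stability on those items (for example,
    the intra_method_agreement sketch above)."""
    results = {}
    for model in models:
        for dataset in datasets:
            items = load_items(dataset)
            for name, make_probe in probes.items():
                probe = make_probe(model)
                results[(model, dataset, name)] = agreement_fn(items, probe)
    return results
```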

Implications

The findings have far-reaching implications. Firstly, the results suggest that current knowledge probing methods are unreliable for robust knowledge gap detection, thereby casting doubt on the efficacy of these methods in practical applications. Secondly, since these inconsistencies undermine the interpretability and reliability of LLMs, there's an urgent need for novel perturbation-robust frameworks in probing knowledge gaps.

Prospect for Future Research

This research opens pathways for future work on more robust, perturbation-resistant probing methods. There is an evident need for frameworks that remain consistent under trivial perturbations while providing reliable assessments of LLMs' knowledge gaps. Future studies should focus on designing better metrics for evaluating probing consistency and robustness, potentially contributing to more dependable applications of LLMs in critical real-world settings.

In sum, this paper highlights the limitations inherent in present knowledge probing frameworks and sets the stage for more robust and reliable methods in the study and application of LLMs.
