- The paper comprehensively surveys the knowledge boundaries of Large Language Models, defining the concept and proposing a formal four-type taxonomy for categorizing knowledge.
- It details methods for identifying these boundaries using techniques like uncertainty estimation, confidence calibration, and internal state probing.
- The study reviews strategies to mitigate issues arising from knowledge boundaries, such as prompt optimization, external knowledge retrieval, and mechanisms for model refusal.
The paper addresses the limitations of LLMs in memorizing and utilizing knowledge, which lead to untruthful or inaccurate responses. It proposes a comprehensive definition of the LLM knowledge boundary and a formal taxonomy that categorizes knowledge into four distinct types. It then systematically reviews the motivation for studying LLM knowledge boundaries, methods for identifying these boundaries, and strategies for mitigating the challenges they present.
The paper defines three types of knowledge boundaries:
- Outward Knowledge Boundary: the observable knowledge boundary for a specific LLM, verifiable through a limited subset of expressions.
- Parametric Knowledge Boundary: the abstract knowledge boundary, where knowledge is embedded within the LLM parameters and verifiable by at least one expression.
- Universal Knowledge Boundary: the whole set of knowledge known to humans, verifiable by input-output pairs.
Based on these boundaries, the paper establishes a formal four-type knowledge taxonomy:
- Prompt-Agnostic Known Knowledge (PAK): Knowledge verifiable by all available expressions, regardless of prompt phrasing, where $K_{\text{PAK}} = \{k \in K \mid \forall (q_i^k, a_i^k) \in \hat{Q}^k,\ P_\theta(a_i^k \mid q_i^k) > \epsilon\}$.
- $k$: a piece of knowledge
- $K$: the whole set of abstracted knowledge known to humans
- $(q_i^k, a_i^k)$: an input–output expression of $k$
- $Q^k$: the full set of expressions of $k$
- $\hat{Q}^k \subseteq Q^k$: the limited subset of expressions available in practice
- $P_\theta$: the probability assigned by the model with parameters $\theta$
- $\epsilon$: a correctness threshold
- Prompt-Sensitive Known Knowledge (PSK): Knowledge residing within the LLM's parameters but sensitive to prompt phrasing, where $K_{\text{PSK}} = \{k \in K \mid (\exists (q_i^k, a_i^k) \in Q^k,\ P_\theta(a_i^k \mid q_i^k) > \epsilon) \land (\exists (q_i^k, a_i^k) \in \hat{Q}^k,\ P_\theta(a_i^k \mid q_i^k) < \epsilon)\}$.
- Model-Specific Unknown Knowledge (MSU): Knowledge known to humans but not possessed in the specific LLM's parameters, where $K_{\text{MSU}} = \{k \in K \mid \forall (q_i^k, a_i^k) \in Q^k,\ P_\theta(a_i^k \mid q_i^k) < \epsilon\}$.
- Model-Agnostic Unknown Knowledge (MAU): Knowledge unknown to both the model and humans, where $K_{\text{MAU}} = \{k \in K \mid Q^k = \emptyset\}$.
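The four set definitions above can be operationalized directly. The following is a minimal sketch (not from the paper) in which each piece of knowledge carries the model's correctness probabilities over the full expression set $Q^k$ and the available subset $\hat{Q}^k$; the `classify` function and the threshold value 0.5 are illustrative assumptions:

```python
from dataclasses import dataclass

EPSILON = 0.5  # assumed correctness threshold (the epsilon in the formulas above)

@dataclass
class Knowledge:
    """A piece of knowledge k, represented by the model's probability
    P_theta(a | q) of answering correctly for each of its expressions."""
    full_probs: list[float]      # P_theta over the full expression set Q^k
    observed_probs: list[float]  # P_theta over the available subset \hat{Q}^k

def classify(k: Knowledge) -> str:
    # MAU: no expression of k exists at all (Q^k is empty)
    if not k.full_probs:
        return "MAU"
    # MSU: every expression in Q^k falls below the threshold
    if all(p < EPSILON for p in k.full_probs):
        return "MSU"
    # PAK: every *available* expression in \hat{Q}^k exceeds the threshold
    if all(p > EPSILON for p in k.observed_probs):
        return "PAK"
    # PSK: some expression succeeds, but some available expression fails
    return "PSK"

print(classify(Knowledge([0.9, 0.8], [0.9, 0.8])))  # PAK
print(classify(Knowledge([0.9, 0.2], [0.2])))       # PSK
print(classify(Knowledge([0.1, 0.2], [0.1])))       # MSU
print(classify(Knowledge([], [])))                  # MAU
```

Note that in practice only $\hat{Q}^k$ is observable, which is precisely why distinguishing PSK from MSU is hard and motivates the identification methods surveyed below.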
The authors discuss undesirable behaviors of LLMs that stem from unawareness of knowledge boundaries, such as factuality hallucinations, untruthful responses misled by context, and truthful but undesired responses. Factuality hallucinations arise from deficiencies in domain-specific knowledge, outdated information, and overconfidence in addressing unknowns. Moreover, LLMs often produce untruthful responses when misled by untruthful or irrelevant context. Ambiguous knowledge may lead to random responses, while controversial knowledge can result in biased outputs.
The survey categorizes solutions for identifying knowledge boundaries into uncertainty estimation (UE), confidence calibration, and internal state probing. UE quantifies the uncertainty of a model's predictions, decomposing it into epistemic and aleatoric uncertainty. Epistemic uncertainty quantifies the lack of model knowledge, while aleatoric uncertainty refers to data-level uncertainty. Confidence calibration aligns the estimated LLM confidence with actual correctness using prompt-based and fine-tuning approaches. Internal state probing assesses factual accuracy using linear probing on internal states, including attention heads, hidden layer activations, neurons, and tokens.
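Two of these identification techniques lend themselves to short, self-contained sketches: estimating uncertainty from the spread of sampled answers, and measuring calibration via expected calibration error (ECE). The functions below are illustrative implementations of these standard quantities, not the paper's specific methods:

```python
import math
from collections import Counter

def predictive_entropy(answers: list[str]) -> float:
    """Entropy over answers sampled from the model for one question:
    higher spread suggests higher (epistemic) uncertainty."""
    n = len(answers)
    counts = Counter(answers)
    return -sum((c / n) * math.log(c / n) for c in counts.values())

def expected_calibration_error(confidences, correct, n_bins=10):
    """ECE: confidence-weighted gap between stated confidence and
    empirical accuracy, averaged over equal-width confidence bins."""
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, ok))
    n = len(confidences)
    ece = 0.0
    for b in bins:
        if not b:
            continue
        avg_conf = sum(c for c, _ in b) / len(b)
        accuracy = sum(ok for _, ok in b) / len(b)
        ece += (len(b) / n) * abs(avg_conf - accuracy)
    return ece

print(predictive_entropy(["Paris", "Paris", "Paris"]))  # 0.0 (fully consistent)
print(round(predictive_entropy(["Paris", "Lyon"]), 3))  # 0.693 (maximal for 2 answers)
```

A well-calibrated model has ECE near zero; calibration methods (prompt-based or fine-tuned) aim to shrink this gap so that confidence can be trusted as a knowledge-boundary signal.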
Strategies to mitigate issues caused by knowledge boundaries include prompt optimization, prompt-based reasoning, self-refinement, and factuality decoding for PSK. For MSU, mitigation involves external knowledge retrieval, parametric knowledge editing, and knowledge-enhanced fine-tuning. To address MAU, strategies include refusal and asking clarification questions.
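These mitigation families can be composed into a simple dispatch policy: answer from parametric knowledge when confident, fall back to retrieval when the model likely lacks the knowledge, and refuse otherwise. The sketch below is a hypothetical illustration; the `retrieve` hook and the 0.6 threshold are assumptions, not the paper's method:

```python
def mitigate(query, model_answer, confidence, retrieve=None):
    """Pick a mitigation strategy from an estimated confidence score.

    `retrieve` is a hypothetical external-knowledge lookup (e.g. a search
    or RAG call) returning supporting text, or None if nothing is found.
    """
    # Confident parametric knowledge (PAK, or PSK with a good prompt): answer directly.
    if confidence >= 0.6:  # assumed threshold
        return model_answer
    # Likely model-specific unknown (MSU): fall back to external retrieval.
    if retrieve is not None:
        evidence = retrieve(query)
        if evidence:
            return f"According to retrieved evidence: {evidence}"
    # Likely model-agnostic unknown (MAU), or no evidence: refuse rather than hallucinate.
    return "I don't know."

print(mitigate("Capital of France?", "Paris", 0.95))           # Paris
print(mitigate("Unanswerable?", "guess", 0.2))                 # I don't know.
```

Setting the threshold is exactly the trade-off the paper flags as a side effect: too high and the model over-refuses, too low and hallucinations slip through.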
The authors also highlight several challenges and prospects: the need for more comprehensive benchmarks to assess knowledge boundaries, the generalization of knowledge boundary identification across domains, the utilization of LLM knowledge boundaries in future developments, and the mitigation of unintended side effects such as over-refusal and unnecessary costs.