Learning to Trust Your Feelings: Leveraging Self-awareness in LLMs for Hallucination Mitigation (2401.15449v1)

Published 27 Jan 2024 in cs.CL

Abstract: We evaluate the ability of LLMs to discern and express their internal knowledge state, a key factor in countering factual hallucination and ensuring reliable application of LLMs. We observe a robust self-awareness of internal knowledge state in LLMs, evidenced by over 85% accuracy in knowledge probing. However, LLMs often fail to express their internal knowledge during generation, leading to factual hallucinations. We develop an automated hallucination annotation tool, Dreamcatcher, which merges knowledge probing and consistency checking methods to rank factual preference data. Using knowledge preference as the reward, we propose a Reinforcement Learning from Knowledge Feedback (RLKF) training framework, leveraging reinforcement learning to enhance the factuality and honesty of LLMs. Our experiments across multiple models show that RLKF training effectively enhances the ability of models to utilize their internal knowledge state, boosting performance in a variety of knowledge-based and honesty-related tasks.
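The abstract describes a pipeline: probe whether the model "knows" an answer via consistency across samples, then rank responses into preference pairs that reward factual answers when knowledge is present and honest refusals when it is not. Below is a minimal sketch of that annotation step under stated assumptions; the function names (`annotate_preferences`, `consistency_score`, `sample_fn`) and the agreement threshold are illustrative choices, not the paper's actual Dreamcatcher implementation.

```python
# Sketch of a Dreamcatcher-style preference annotation step (assumed, not
# the authors' code): sample several answers per question, use agreement
# among samples (consistency checking) as a proxy for the model's internal
# knowledge state, and emit (chosen, rejected) pairs for RLKF-style training.

from collections import Counter
from typing import Callable, List, Tuple

def consistency_score(answers: List[str]) -> Tuple[str, float]:
    """Return the majority answer and the fraction of samples agreeing."""
    top, count = Counter(answers).most_common(1)[0]
    return top, count / len(answers)

def annotate_preferences(
    question: str,
    sample_fn: Callable[[str, int], List[str]],  # LLM sampler (assumed interface)
    n_samples: int = 8,
    known_threshold: float = 0.75,  # illustrative cutoff, not from the paper
) -> Tuple[str, str]:
    """Build one (chosen, rejected) preference pair for a question.

    High agreement -> the model likely knows the fact, so prefer the
    factual answer over a refusal. Low agreement -> likely hallucination,
    so prefer an honest refusal over a confident guess.
    """
    answers = sample_fn(question, n_samples)
    majority, score = consistency_score(answers)
    refusal = "I'm not sure about this."
    if score >= known_threshold:
        return majority, refusal   # model "knows": reward factuality
    return refusal, majority       # model doesn't know: reward honesty

# Toy usage with a fake sampler standing in for a real LLM.
if __name__ == "__main__":
    fake_sampler = lambda q, n: ["Paris"] * 7 + ["Lyon"]
    chosen, rejected = annotate_preferences("Capital of France?", fake_sampler)
    print(chosen, "|", rejected)  # -> Paris | I'm not sure about this.
```

The resulting pairs would then serve as reward data for the RL stage; the paper additionally combines this consistency signal with direct knowledge probing, which is omitted here for brevity.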

Authors (4)
  1. Yuxin Liang (7 papers)
  2. Zhuoyang Song (4 papers)
  3. Hao Wang (1119 papers)
  4. Jiaxing Zhang (39 papers)
Citations (21)