A Survey on the Honesty of Large Language Models (2409.18786v1)
Abstract: Honesty is a fundamental principle for aligning LLMs with human values, requiring these models to recognize what they know and don't know and to faithfully express their knowledge. Despite their promise, current LLMs still exhibit significant dishonest behaviors, such as confidently presenting wrong answers or failing to express what they know. Research on the honesty of LLMs also faces challenges, including varying definitions of honesty, difficulties in distinguishing between known and unknown knowledge, and a lack of comprehensive understanding of related research. To address these issues, we provide a survey on the honesty of LLMs, covering its clarification, evaluation approaches, and strategies for improvement. Moreover, we offer insights for future research, aiming to inspire further exploration in this important area.
- Siheng Li
- Cheng Yang
- Taiqiang Wu
- Chufan Shi
- Yuji Zhang
- Xinyu Zhu
- Zesen Cheng
- Deng Cai
- Mo Yu
- Lemao Liu
- Jie Zhou
- Yujiu Yang
- Ngai Wong
- Xixin Wu
- Wai Lam