
Don't Just Say "I don't know"! Self-aligning Large Language Models for Responding to Unknown Questions with Explanations

Published 23 Feb 2024 in cs.CL and cs.LG (arXiv:2402.15062v2)

Abstract: Despite the remarkable abilities of LLMs to answer questions, they often display considerable overconfidence even when a question has no definitive answer. To avoid providing hallucinated answers to such unknown questions, existing studies typically investigate approaches for refusing to answer them. In this work, we propose a novel and scalable self-alignment method that uses the LLM itself to enhance its ability to respond to different types of unknown questions, making it capable not only of refusing to answer but also of explaining why an unknown question is unanswerable. Specifically, the Self-Align method first employs a two-stage class-aware self-augmentation approach to generate a large amount of unknown question-response data. We then conduct disparity-driven self-curation to select qualified data for fine-tuning the LLM itself, aligning its responses to unknown questions as desired. Experimental results on two datasets across four types of unknown questions validate the superiority of the Self-Align method over existing baselines across three task formulations.


Summary

  • The paper introduces a self-alignment framework enabling LLMs to recognize and explain limitations when faced with unanswerable questions.
  • It employs a two-stage process with class-aware self-augmentation and disparity-driven self-curation to refine generated responses.
  • Experimental results show significant improvements in unknown question detection and response quality compared to existing baselines.

Self-Aligning LLMs for Unknown Questions

Introduction

The paper addresses the common problem of overconfidence in LLMs when faced with unanswerable or ill-posed questions. LLMs tend to provide confident yet inaccurate responses, producing hallucinated content. Existing methods largely focus on sophisticated reasoning or knowledge-enhanced techniques to improve accuracy when definitive answers exist; however, these approaches are suboptimal when questions lack definitive answers, often failing to recognize that a question is inherently unanswerable.

Self-Alignment Method

The authors propose a scalable self-alignment approach that endows LLMs with the ability to recognize and explain their limitations when dealing with unknown questions. This method involves a two-stage process of class-aware self-augmentation followed by disparity-driven self-curation to fine-tune the LLM's responses (Figure 1).

Figure 1: The workflow of the Self-Align method.

Class-Aware Self-Augmentation

First, the method employs a base LLM to generate unknown question-response data. This synthesis leverages a small seed set of known-unknown question pairs to instruct the model to rewrite known questions into different types of unknown questions.

  1. Guided Question Rewriting: Using the seed pairs as few-shot examples, the model converts known questions into unknown variants that reflect various types of unanswerability.
  2. Conditioned Response Generation: The LLM then generates responses conditioned on an understanding of the question's unanswerability, providing explanations rather than attempting answers.
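The two stages above can be sketched as prompt construction plus two model calls. This is a minimal illustration, not the paper's actual prompts: the prompt wording, the category names, and the `llm` callable are all assumptions.

```python
def build_rewrite_prompt(known_question, category, seed_pairs):
    """Few-shot prompt: seed (known, unknown) pairs demonstrate how to
    rewrite a known question into the target unknown category."""
    shots = "\n".join(
        f"Known: {k}\nUnknown ({category}): {u}" for k, u in seed_pairs
    )
    return (
        f"Rewrite the known question into an unknown question of type "
        f"'{category}'.\n{shots}\n"
        f"Known: {known_question}\nUnknown ({category}):"
    )

def build_response_prompt(unknown_question, category):
    """Prompt conditioned on the question's unanswerability, so the model
    explains why no definitive answer exists instead of guessing one."""
    return (
        f"The following question is unanswerable because it is {category}. "
        f"Explain why it cannot be answered rather than giving an answer.\n"
        f"Question: {unknown_question}\nResponse:"
    )

def self_augment(llm, known_questions, category, seed_pairs):
    """Stage 1: guided question rewriting; stage 2: conditioned response
    generation. Returns (known, unknown, response) records."""
    records = []
    for q in known_questions:
        unknown_q = llm(build_rewrite_prompt(q, category, seed_pairs))
        response = llm(build_response_prompt(unknown_q, category))
        records.append({"known": q, "unknown": unknown_q, "response": response})
    return records
```

Because both stages are driven by the same base LLM, the pipeline scales with no additional human annotation beyond the seed pairs.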

Disparity-Driven Self-Curation

To improve the quality of the augmented data, the self-curation step assesses the disparity between the generated unknown question-answer pairs and their original known question-answer counterparts. This process filters out noisy or low-quality samples, ensuring only high-quality data is used for further model training.
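The filtering idea can be illustrated with a toy disparity score. The word-overlap measure and the threshold below are stand-ins for whatever disparity measure the authors actually use; only the keep-if-sufficiently-different logic mirrors the description above.

```python
def disparity(known_answer, unknown_response):
    """Crude disparity: 1 minus word-level Jaccard overlap. A high score
    means the explanation diverges from the original factual answer."""
    a = set(known_answer.lower().split())
    b = set(unknown_response.lower().split())
    if not a or not b:
        return 1.0
    return 1.0 - len(a & b) / len(a | b)

def self_curate(samples, threshold=0.5):
    """Keep only samples whose generated response differs sufficiently from
    the original known answer, filtering near-duplicates and noisy rewrites."""
    return [
        s for s in samples
        if disparity(s["known_answer"], s["response"]) >= threshold
    ]
```

A rewrite that merely echoes the known answer scores near zero disparity and is discarded; a genuine explanation of unanswerability diverges and survives curation.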

Experimental Validation

The self-alignment method was rigorously tested on datasets specially curated for unknown questions. Experimental results indicate that the proposed method surpasses existing baselines significantly. The improvements are evident in key tasks such as unknown question detection, classification, and open-ended response generation, as shown by F1 scores and human evaluation metrics.
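For the detection formulation, the F1 metric mentioned above treats "unknown" as the positive class. A self-contained computation (the labels below are toy data, not the paper's results):

```python
def f1_score(gold, pred, positive="unknown"):
    """F1 for unknown-question detection, with `positive` as the target label."""
    tp = sum(g == positive and p == positive for g, p in zip(gold, pred))
    fp = sum(g != positive and p == positive for g, p in zip(gold, pred))
    fn = sum(g == positive and p != positive for g, p in zip(gold, pred))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```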


Figure 2: Effect of self-curation approaches.

Figure 3: Effect of iterative self-alignment.

Discussion

The work establishes that enhancing LLMs' capability to recognize and appropriately handle their knowledge limitations leads to more reliable AI systems. Iterative self-alignment was identified as a process that further refines the model's effectiveness over successive cycles, a finding that suggests room for continued performance enhancements.
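The iterative refinement described above amounts to repeating the augment-curate-fine-tune loop with the current model as the generator. A schematic sketch, where `augment`, `curate`, and `fine_tune` are placeholders for the stages above and a real training call, not an actual API:

```python
def iterative_self_align(model, seed_data, augment, curate, fine_tune, cycles=2):
    """Each cycle: the current model synthesizes unknown-question data,
    the data is curated, and the model is fine-tuned on the result."""
    for _ in range(cycles):
        raw = augment(model, seed_data)   # class-aware self-augmentation
        clean = curate(raw)               # disparity-driven self-curation
        model = fine_tune(model, clean)   # align the model on curated data
    return model
```

Because the fine-tuned model generates the next cycle's data, gains can compound across cycles, consistent with the improvements the authors report for iterative self-alignment.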

The case studies demonstrate that the Self-Align method enables LLMs not only to identify when a question is unanswerable but also to articulate reasonable and coherent explanations, an advance over merely refusing to answer (Figure 4).

Figure 4: Case study. The left example is an ambiguous question and the right an incorrect question, contrasting hallucinated content with helpful explanations.

Conclusion

The research expands the understanding of how LLMs can self-align to improve responses to unknown questions, offering a significant advancement in generating trustworthy and accurate AI responses. The implications extend to various applications in conversational AI and autonomous systems where recognizing unknowns is crucial. Future developments may focus on enhancing the self-curation processes and exploring applications in interactive systems that require real-time learning and adaptation.

Overall, the paper proposes a robust framework that can be leveraged for other domains dealing with uncertain or incomplete information scenarios, marking a step towards more intelligent and self-aware AI systems.
