
Don't Just Say "I don't know"! Self-aligning Large Language Models for Responding to Unknown Questions with Explanations

Published 23 Feb 2024 in cs.CL and cs.LG (arXiv:2402.15062v2)

Abstract: Despite the remarkable abilities of LLMs to answer questions, they often display considerable overconfidence even when a question has no definitive answer. To avoid providing hallucinated answers to such unknown questions, existing studies typically investigate approaches for refusing to answer them. In this work, we propose a novel and scalable self-alignment method that uses the LLM itself to enhance its ability to respond to different types of unknown questions, making it capable not only of refusing to answer but also of explaining why an unknown question is unanswerable. Specifically, the Self-Align method first employs a two-stage class-aware self-augmentation approach to generate a large amount of unknown question-response data. We then conduct disparity-driven self-curation to select qualified data for fine-tuning the LLM itself, aligning its responses to unknown questions as desired. Experimental results on two datasets across four types of unknown questions validate the superiority of the Self-Align method over existing baselines across three task formulations.


Summary

  • The paper introduces a self-alignment framework enabling LLMs to recognize and explain limitations when faced with unanswerable questions.
  • It employs a two-stage process with class-aware self-augmentation and disparity-driven self-curation to refine generated responses.
  • Experimental results show significant improvements in unknown question detection and response quality compared to existing baselines.

Self-Aligning LLMs for Unknown Questions

Introduction

The paper addresses the common problem of overconfidence in LLMs when faced with unanswerable or ill-posed questions. LLMs tend to provide confident yet inaccurate responses, producing hallucinated content. Existing methods largely focus on sophisticated reasoning or knowledge-enhanced techniques to improve accuracy when definitive answers exist; however, these approaches are suboptimal when questions lack definitive answers, often failing to recognize that a question is inherently unanswerable.

Self-Alignment Method

The authors propose a scalable self-alignment approach that endows LLMs with the ability to recognize and explain their limitations when dealing with unknown questions. This method involves a two-stage process of class-aware self-augmentation followed by disparity-driven self-curation to fine-tune the LLM's responses (Figure 1).

Figure 1: The workflow of the Self-Align method.

Class-Aware Self-Augmentation

First, the method employs a base LLM to generate unknown question-response data. This synthesis leverages a small seed set of known-unknown question pairs to instruct the model to rewrite known questions into different types of unknown questions.

  1. Guided Question Rewriting: Using the seed pairs as few-shot examples, the model converts known questions into unknown variants that reflect various types of unanswerability.
  2. Conditioned Response Generation: The LLM then generates responses conditioned on an understanding of the question's unanswerability, providing explanations rather than attempting answers.
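The two stages above can be sketched as prompt construction plus two model calls. This is a minimal illustration, not the paper's actual prompts: the prompt wording, the category names, and the `llm` callable are all assumptions.

```python
def build_rewrite_prompt(known_question, category, seed_pairs):
    """Few-shot prompt: seed (known, unknown) pairs demonstrate how to
    rewrite a known question into the target unknown category."""
    shots = "\n".join(
        f"Known: {k}\nUnknown ({category}): {u}" for k, u in seed_pairs
    )
    return (
        f"Rewrite the known question into an unknown question of type "
        f"'{category}'.\n{shots}\n"
        f"Known: {known_question}\nUnknown ({category}):"
    )

def build_response_prompt(unknown_question, category):
    """Prompt conditioned on the question's unanswerability, so the model
    explains why no definitive answer exists instead of guessing one."""
    return (
        f"The following question is unanswerable because it is {category}. "
        f"Explain why it cannot be answered rather than giving an answer.\n"
        f"Question: {unknown_question}\nResponse:"
    )

def self_augment(llm, known_questions, category, seed_pairs):
    """Stage 1: guided question rewriting; stage 2: conditioned response
    generation. Returns (known, unknown, response) records."""
    records = []
    for q in known_questions:
        unknown_q = llm(build_rewrite_prompt(q, category, seed_pairs))
        response = llm(build_response_prompt(unknown_q, category))
        records.append({"known": q, "unknown": unknown_q, "response": response})
    return records
```

Because both stages are driven by the same base LLM, the pipeline scales with no additional human annotation beyond the seed pairs.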

Disparity-Driven Self-Curation

To improve the quality of the augmented data, the self-curation step assesses the disparity between the generated unknown question-answer pairs and their original known question-answer counterparts. This process filters out noisy or low-quality samples, ensuring only high-quality data is used for further model training.
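The filtering idea can be illustrated with a toy disparity score. The word-overlap measure and the threshold below are stand-ins for whatever disparity measure the authors actually use; only the keep-if-sufficiently-different logic mirrors the description above.

```python
def disparity(known_answer, unknown_response):
    """Crude disparity: 1 minus word-level Jaccard overlap. A high score
    means the explanation diverges from the original factual answer."""
    a = set(known_answer.lower().split())
    b = set(unknown_response.lower().split())
    if not a or not b:
        return 1.0
    return 1.0 - len(a & b) / len(a | b)

def self_curate(samples, threshold=0.5):
    """Keep only samples whose generated response differs sufficiently from
    the original known answer, filtering near-duplicates and noisy rewrites."""
    return [
        s for s in samples
        if disparity(s["known_answer"], s["response"]) >= threshold
    ]
```

A rewrite that merely echoes the known answer scores near zero disparity and is discarded; a genuine explanation of unanswerability diverges and survives curation.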

Experimental Validation

The self-alignment method was rigorously tested on datasets specially curated for unknown questions. Experimental results indicate that the proposed method surpasses existing baselines significantly. The improvements are evident in key tasks such as unknown question detection, classification, and open-ended response generation, as shown by F1 scores and human evaluation metrics.
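For the detection formulation, the F1 metric mentioned above treats "unknown" as the positive class. A self-contained computation (the labels below are toy data, not the paper's results):

```python
def f1_score(gold, pred, positive="unknown"):
    """F1 for unknown-question detection, with `positive` as the target label."""
    tp = sum(g == positive and p == positive for g, p in zip(gold, pred))
    fp = sum(g != positive and p == positive for g, p in zip(gold, pred))
    fn = sum(g == positive and p != positive for g, p in zip(gold, pred))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```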


Figure 2: Effect of self-curation approaches.

Figure 3: Effect of iterative self-alignment.

Discussion

The work establishes that enhancing LLMs' capability to recognize and appropriately handle their knowledge limitations leads to more reliable AI systems. Iterative self-alignment was identified as a process that further refines the model's effectiveness over successive cycles, a finding that suggests room for continued performance enhancements.
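The iterative refinement described above amounts to repeating the augment-curate-fine-tune loop with the current model as the generator. A schematic sketch, where `augment`, `curate`, and `fine_tune` are placeholders for the stages above and a real training call, not an actual API:

```python
def iterative_self_align(model, seed_data, augment, curate, fine_tune, cycles=2):
    """Each cycle: the current model synthesizes unknown-question data,
    the data is curated, and the model is fine-tuned on the result."""
    for _ in range(cycles):
        raw = augment(model, seed_data)   # class-aware self-augmentation
        clean = curate(raw)               # disparity-driven self-curation
        model = fine_tune(model, clean)   # align the model on curated data
    return model
```

Because the fine-tuned model generates the next cycle's data, gains can compound across cycles, consistent with the improvements the authors report for iterative self-alignment.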

The case studies demonstrate that the Self-Align method enables LLMs not only to identify when a question is unanswerable but also to articulate reasonable and coherent explanations, an advance over merely refusing to answer (Figure 4).

Figure 4: Case study. The left example is an ambiguous question and the right an incorrect question, contrasting hallucinated content with helpful explanations.

Conclusion

The research expands the understanding of how LLMs can self-align to improve responses to unknown questions, offering a significant advancement in generating trustworthy and accurate AI responses. The implications extend to various applications in conversational AI and autonomous systems where recognizing unknowns is crucial. Future developments may focus on enhancing the self-curation processes and exploring applications in interactive systems that require real-time learning and adaptation.

Overall, the paper proposes a robust framework that can be leveraged for other domains dealing with uncertain or incomplete information scenarios, marking a step towards more intelligent and self-aware AI systems.
