Confidence Matters: Revisiting Intrinsic Self-Correction Capabilities of Large Language Models (2402.12563v3)
Abstract: The recent success of LLMs has catalyzed increasing interest in their self-correction capabilities. This paper presents a comprehensive investigation into the intrinsic self-correction of LLMs, attempting to address the ongoing debate about its feasibility. Our research identifies an important latent factor in the self-correction process: the "confidence" of LLMs. Overlooking this factor may cause models to over-criticize themselves, leading to unreliable conclusions about the efficacy of self-correction. We experimentally observe that LLMs can assess the "confidence" of their own responses. This motivates us to develop an "If-or-Else" (IoE) prompting framework that guides LLMs in assessing their own "confidence", facilitating intrinsic self-correction. We conduct extensive experiments and demonstrate that our IoE-based prompt achieves consistent improvements in the accuracy of self-corrected responses over the initial answers. Our study not only sheds light on the underlying factors affecting self-correction in LLMs, but also introduces a practical framework that applies the IoE prompting principle to efficiently improve self-correction capabilities with "confidence". The code is available at https://github.com/MBZUAI-CLeaR/IoE-Prompting.git.
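To make the IoE principle concrete, below is a minimal sketch of a single self-correction round as the abstract describes it: the model answers, then is asked to keep its answer only if it is confident, and to revise it otherwise. The `complete` function and the exact prompt wording are illustrative assumptions; the paper's actual templates live in the linked repository.

```python
def complete(prompt: str) -> str:
    """Placeholder for any chat-completion call (hypothetical stub,
    not an API from the paper). Swap in your LLM client here."""
    raise NotImplementedError

# Illustrative IoE-style instruction, paraphrasing the abstract's
# "If-or-Else" idea; not the paper's verbatim prompt.
IOE_PROMPT = (
    "If you are confident in your answer above, keep it unchanged. "
    "Otherwise, revise it. Then state your final answer."
)

def ioe_self_correct(question: str) -> str:
    initial = complete(question)  # first-pass answer
    # One IoE round: the model judges its own confidence and
    # revises only when it is unsure, avoiding over-criticism.
    return complete(
        f"Question: {question}\n"
        f"Your answer: {initial}\n"
        f"{IOE_PROMPT}"
    )
```

The key design choice, per the abstract, is that revision is conditioned on the model's self-assessed confidence rather than requested unconditionally, which is what prevents the model from over-criticizing an already-correct initial answer.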
Authors: Loka Li, Guangyi Chen, Yusheng Su, Zhenhao Chen, Yixuan Zhang, Eric Xing, Kun Zhang