Making LLMs Better Reasoners with Alignment
The paper "Making LLMs Better Reasoners with Alignment" published by Peiyi Wang et al. proposes a novel approach to enhance the reasoning abilities of LLMs such as LLama and GPT variants. It addresses the shortcomings of chain-of-thought (COT) fine-tuning paradigms, which, while effective at improving reasoning, suffer from the "Assessment Misalignment" problem. This issue arises when models assign higher scores to suboptimal reasoning paths, potentially limiting their effectiveness.
Key Concepts and Methods
One of the main contributions of the paper is the Alignment Fine-Tuning (AFT) paradigm, designed to address the Assessment Misalignment problem. The AFT process consists of three steps:
- Fine-tuning LLMs using COT training data.
- Generating multiple COT responses for each training question and splitting them into positive (correct) and negative (incorrect) groups, as sketched after this list.
- Re-calibrating the scores of these responses with the novel Constraint Alignment (CA) loss.
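As a rough illustration of the second step, the sketch below partitions sampled responses by final-answer correctness. The `sample_cot` and `extract_answer` callables are hypothetical placeholders for model sampling and answer parsing; the paper does not prescribe this interface.

```python
# Illustrative sketch of AFT step 2 (not the authors' code): sample several
# chain-of-thought responses per question and split them by answer correctness.
from typing import Callable, List, Tuple

def collect_feedback(sample_cot: Callable[[str], str],      # hypothetical: returns one sampled COT response
                     extract_answer: Callable[[str], str],  # hypothetical: parses the final answer from a response
                     question: str,
                     reference_answer: str,
                     k: int = 8) -> Tuple[List[str], List[str]]:
    """Return (positive, negative) response groups for one training question."""
    positives, negatives = [], []
    for _ in range(k):
        response = sample_cot(question)
        if extract_answer(response) == reference_answer:
            positives.append(response)   # correct final answer -> positive group
        else:
            negatives.append(response)   # incorrect final answer -> negative group
    return positives, negatives
```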
The CA loss itself combines two objectives:
- Alignment: Ensures that positive examples receive higher scores than negative ones, sharpening the model's ability to distinguish between reasoning paths of differing quality.
- Constraint: Keeps the scores of negative responses within a reasonable range, preventing model degradation. The loss can also be adapted to ranking settings when ranked feedback is available; a sketch of the combined loss follows this list.
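To make the two objectives concrete, here is a minimal PyTorch sketch of a constrained alignment loss. It assumes each response is scored by its length-normalized log-likelihood under the model; the pairwise hinge for the alignment term and the fixed offset `beta` defining the constraint boundary are illustrative choices, not the paper's exact formulation.

```python
# Illustrative constrained-alignment loss for one question (not the paper's
# exact CA loss). Scores are assumed to be length-normalized log-likelihoods.
import torch

def constrained_alignment_loss(pos_scores: torch.Tensor,   # (P,) scores of correct responses
                               neg_scores: torch.Tensor,   # (N,) scores of incorrect responses
                               beta: float = 0.1) -> torch.Tensor:
    # Alignment: every positive response should outscore every negative one.
    gaps = neg_scores.unsqueeze(0) - pos_scores.unsqueeze(1)      # (P, N) pairwise margins
    alignment = torch.clamp(gaps, min=0.0).mean()

    # Constraint: penalize negative scores only once they fall more than
    # `beta` below the lowest positive score, so they are not pushed
    # arbitrarily low (which would degrade the underlying language model).
    boundary = pos_scores.min().detach() - beta
    constraint = torch.clamp(boundary - neg_scores, min=0.0).mean()

    return alignment + constraint
```

Anchoring the boundary to the positive scores, rather than to a fixed constant, is one simple way to let the constraint adapt as the model's scores shift during training.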
Numerical Results and Implications
Extensive experiments were conducted on four reasoning benchmarks, including GSM8K and ECQA, covering both binary and ranking feedback scenarios. AFT consistently outperformed vanilla fine-tuning across all datasets, with an average accuracy improvement of roughly 2-3%, underscoring the robustness of the proposed approach across testing settings.
Additionally, the paper examines existing ranking-based alignment methods such as DPO, RRHF, and PRO, and shows that they fail to adequately constrain how far the scores of dispreferred responses can be pushed down. The paper argues that this constraint, previously overlooked in ranking-based alignment, plays a crucial role in improving model assessments while avoiding degradation; the contrast is sketched below.
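To illustrate the contrast, an RRHF-style ranking loss over length-normalized log-likelihood scores $p_i$ has roughly the form below; the boundary term on the right is shown only to sketch the idea of a constraint and is not the paper's exact loss:

$$
\mathcal{L}_{\text{rank}} = \sum_{r_i < r_j} \max(0,\; p_i - p_j),
\qquad
\mathcal{L}_{\text{constrained}} = \mathcal{L}_{\text{rank}} + \sum_{i \in \text{neg}} \max(0,\; B - p_i),
$$

where $r_i$ is the ranking feedback for response $i$ and $B$ is a boundary, for example a fixed offset below the lowest positive score. Without the second term, nothing stops training from driving the scores of dispreferred responses arbitrarily low, which is exactly the degradation the constraint is meant to prevent.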
Implications for Future AI Developments
The findings from this paper have profound implications for AI and the development of models that can reason more effectively. Practically, improving reasoning capabilities without sacrificing model integrity can expand the application areas of AI—particularly in complex problem-solving and decision-making tasks. The theoretical insights regarding scoring alignment provide a framework that can inform future algorithm design, fostering innovation in machine reasoning.
The AFT method, particularly its adaptability to incorporate ranking feedback, could represent a significant advancement in the pursuit of more generalizable AI models. Researchers can build on this approach to explore how aligned reasoning capabilities affect interpretability and trust in AI outputs.
Conclusion
The "Making LLMs Better Reasoners with Alignment" paper offers a detailed examination of how alignment strategies can improve LLM reasoning performance. By addressing inherent limitations in current COT fine-tuning processes and laying out a refined adjustment approach through AFT, this paper paves the way for sophisticated and dependable reasoning systems in AI. The combination of empirical data and theoretical advancements ensures that this work will be a valuable reference point for future advancements in reasoning capabilities within AI systems.