Automatic Prompt Augmentation and Selection with Chain-of-Thought
The paper "Automatic Prompt Augmentation and Selection with Chain-of-Thought from Labeled Data" proposes an innovative methodology, termed Automate-CoT, designed to enhance the reasoning capabilities of LLMs by mitigating the dependency on manually engineered prompts. The authors identify that existing chain-of-thought (CoT) prompting methods rely heavily on human-annotated rational chains, which can pose significant challenges in scaling to diverse real-world applications. Automate-CoT addresses this limitation by introducing a framework that generates, prunes, and selects optimal rationale chains automatically.
Methodology and Key Contributions
Automate-CoT operates in three primary stages:
- Augment: The process begins by prompting the LLM to generate multiple pseudo-rationale chains for the questions in a small labeled dataset, without relying on any human-designed exemplars. This step expands the set of candidate reasoning examples well beyond what manual annotation provides (see the augment-and-prune sketch after this list).
- Prune: Leveraging the assumption that a chain arriving at the correct final answer is more likely to contain correct reasoning, the method prunes away rationale chains whose generated answers disagree with the ground truth. This answer-consistency filter leaves a pool of exemplars that are more likely to improve LLM performance.
- Select: In the final step, a variance-reduced policy gradient strategy determines the most effective combination of pruned chains to use as the few-shot prompt. Since downstream accuracy is a black-box, non-differentiable objective, selection is framed as sampling from a learned distribution over the pool, and the gradient of expected accuracy is estimated with reduced variance (a rough selection sketch also follows this list).
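A minimal sketch of the augment and prune steps, assuming a generic text-completion call and a numeric-answer task; `query_llm`, the answer-extraction regex, and the data layout are illustrative placeholders, not the paper's implementation:

```python
import re

def query_llm(prompt: str, temperature: float = 0.7) -> str:
    """Placeholder: substitute any LLM completion call here."""
    raise NotImplementedError

def augment(labeled_set, num_samples: int = 8):
    """Sample several pseudo-rationale chains for each labeled question."""
    pool = []
    for question, gold_answer in labeled_set:
        prompt = f"Q: {question}\nA: Let's think step by step."
        for _ in range(num_samples):
            rationale = query_llm(prompt, temperature=0.7)
            pool.append((question, rationale, gold_answer))
    return pool

def extract_answer(rationale: str) -> str:
    """Illustrative heuristic: take the last number mentioned in the chain."""
    numbers = re.findall(r"-?\d+\.?\d*", rationale)
    return numbers[-1] if numbers else ""

def prune(pool):
    """Keep only chains whose final answer matches the ground truth."""
    return [(q, chain, gold) for q, chain, gold in pool
            if extract_answer(chain) == str(gold)]
```

Sampling several chains per question at a nonzero temperature is what makes the answer-consistency filter informative: chains that reach the wrong answer are simply discarded.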
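The selection step can then be sketched as REINFORCE with a mean-reward baseline, one common variance-reduced policy gradient estimator; this only approximates the paper's exact formulation. `evaluate_prompt` is a hypothetical placeholder that should build a CoT prompt from the chosen exemplars and return accuracy on a held-out dev set:

```python
import numpy as np

def evaluate_prompt(exemplar_indices, pool, dev_set) -> float:
    """Placeholder: prompt the LLM with the chosen exemplars and
    return its accuracy on dev_set."""
    raise NotImplementedError

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def select(pool, dev_set, k=4, steps=100, batch=8, lr=0.5, seed=0):
    rng = np.random.default_rng(seed)
    logits = np.zeros((k, len(pool)))   # one categorical distribution per exemplar slot
    for _ in range(steps):
        samples, rewards = [], []
        for _ in range(batch):
            idx = [int(rng.choice(len(pool), p=softmax(logits[s])))
                   for s in range(k)]
            samples.append(idx)
            rewards.append(evaluate_prompt(idx, pool, dev_set))
        baseline = np.mean(rewards)     # subtracting the batch mean reduces variance
        grads = np.zeros_like(logits)
        for idx, r in zip(samples, rewards):
            for s, i in enumerate(idx):
                p = softmax(logits[s])
                g = -p
                g[i] += 1.0             # d log p_i / d logits = onehot(i) - p
                grads[s] += (r - baseline) * g
        logits += lr * grads / batch    # gradient ascent on expected accuracy
    # Greedy readout; slots may repeat exemplars, so deduplicate in practice.
    return [int(np.argmax(logits[s])) for s in range(k)]
```

Each gradient step costs `batch` dev-set evaluations, so keeping the dev set small is what makes this black-box search affordable.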
Experimental Results
Empirical evaluations show that Automate-CoT yields consistent gains over manually constructed CoT prompts across four task categories:
- Arithmetic Reasoning: +2.7%
- Commonsense Reasoning: +3.4%
- Symbolic Reasoning: +3.2%
- Non-reasoning Tasks: +2.5%
These improvements highlight the efficacy of automated exemplar construction over traditional hand-crafted prompting, yielding LLM pipelines that are both more effective and easier to adapt to new tasks.
Implications and Future Directions
The introduction of Automate-CoT has implications for both theory and practice. Theoretically, it shows that prompt construction can be cast as an optimization problem over machine-generated candidates rather than an exercise in human intuition. Practically, it removes the burdensome task of hand-crafting prompts, allowing the same pipeline to be applied across varied datasets and linguistic tasks with minimal human intervention.
Looking forward, Automate-CoT may pave the way for LLM systems that adapt their reasoning exemplars through unsupervised or semi-supervised methods. Future work might refine the exemplar generation process or extend the framework to multilingual and multimodal settings, further broadening the scope of LLM applications. Combining the approach with complementary strategies such as self-consistency decoding, or drawing candidate chains from external datasets, could amplify its effectiveness on complex reasoning tasks.