
Self-Contrast: Better Reflection Through Inconsistent Solving Perspectives (2401.02009v3)

Published 4 Jan 2024 in cs.CL and cs.AI

Abstract: The reflection capacity of LLMs has garnered extensive attention. Post-hoc prompting strategies, e.g., Reflexion and Self-Refine, refine an LLM's response based on self-evaluated or external feedback. However, recent research indicates that, without external feedback, an LLM's intrinsic reflection is unstable. Our investigation unveils that the key bottleneck is the quality of the self-evaluated feedback. We find that LLMs often exhibit overconfidence or high randomness when self-evaluating, offering stubborn or inconsistent feedback, which causes poor reflection. To remedy this, we advocate Self-Contrast: it adaptively explores diverse solving perspectives tailored to the request, contrasts their differences, and summarizes these discrepancies into a checklist that can be used to re-examine and eliminate them. Our method endows the LLM with diverse perspectives to alleviate stubborn biases. Moreover, the discrepancies among perspectives indicate potential errors or inherent uncertainties that the LLM often overlooks; reflecting upon these can catalyze more accurate and stable reflection. Experiments conducted on a series of reasoning and translation tasks with different LLMs underscore the effectiveness and generality of our strategy.

Overview of Self-Contrast Strategy

LLMs have shown remarkable prowess in a range of tasks, particularly when supplemented with post-hoc prompting techniques that encourage self-reflection to refine responses. However, without external guidance, the self-reflection process has proven to be unreliable due to the inconsistent and overconfident nature of LLM-generated feedback. In light of these limitations, researchers have proposed a new approach, termed "Self-Contrast," aimed at improving the self-reflection mechanism in LLMs.

Enhancing LLM Self-Reflection

The proposed Self-Contrast method seeks to improve LLM response quality by having the model generate diverse solving perspectives for a given problem. These perspectives are then contrasted with one another to identify discrepancies. By summarizing the discrepancies into a checklist, the LLM gains a more refined instrument for revisiting and revising its previous responses, enabling it to overcome biases and errors that might otherwise go unnoticed.
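The perspective-contrast-checklist loop described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the `llm` callable, the prompt wordings, and the line-based parsing of perspectives are all assumptions standing in for whatever model API and prompt templates are actually used.

```python
from typing import Callable, List

def self_contrast(problem: str, llm: Callable[[str], str], n_perspectives: int = 3) -> str:
    """Sketch of the Self-Contrast loop with a generic LLM completion function."""
    # 1. Adaptively create diverse solving perspectives tailored to this problem.
    raw = llm(
        f"Propose {n_perspectives} distinct solving perspectives "
        f"(e.g., different methods or representations) for this problem, "
        f"one per line:\n{problem}"
    )
    perspectives: List[str] = [p for p in raw.splitlines() if p.strip()][:n_perspectives]

    # 2. Solve the problem independently from each perspective.
    solutions = [
        llm(f"Solve the problem from this perspective: {p}\nProblem: {problem}")
        for p in perspectives
    ]

    # 3. Contrast the solutions and summarize their discrepancies into a checklist.
    joined = "\n---\n".join(solutions)
    checklist = llm(
        "Compare these candidate solutions, list their discrepancies, and turn "
        f"them into a checklist of points to re-examine:\n{joined}"
    )

    # 4. Re-examine the candidates against the checklist and produce a final answer.
    return llm(
        f"Problem: {problem}\nCandidate solutions:\n{joined}\n"
        f"Checklist of discrepancies:\n{checklist}\n"
        "Re-examine the candidates against the checklist and give one final, corrected answer."
    )
```

The key design point is step 3: rather than asking the model to judge a single answer (which invites overconfident or random feedback), the checklist is grounded in concrete disagreements between independently produced solutions.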

Methodology and Findings

The research includes systematic experiments testing the effectiveness of the Self-Contrast method, comparing its performance to traditional self-reflection strategies across reasoning and translation tasks. The findings indicate that Self-Contrast delivers significant improvements in performance and stability by directing the LLMs to produce varied responses and then using discrepancies between these responses as a catalyst for more accurate reflection.

Conclusions and Future Directions

Overall, the Self-Contrast approach significantly reduces the occurrence of invalid or toxic reflections where LLMs fail to correct their mistakes or inaccurately modify correct answers. Despite its promise, it is noted that the method's efficacy diminishes with smaller-scale models that lack strong instruction-following capabilities. Future work may explore external tools for comparing perspectives, offering a potentially more precise and flexible solution for LLM reflection improvement.

Authors (7)
  1. Wenqi Zhang
  2. Yongliang Shen
  3. Linjuan Wu
  4. Qiuying Peng
  5. Jun Wang
  6. Yueting Zhuang
  7. Weiming Lu