
Evoke: Evoking Critical Thinking Abilities in LLMs via Reviewer-Author Prompt Editing (2310.13855v1)

Published 20 Oct 2023 in cs.CL and cs.AI

Abstract: LLMs have made impressive progress in natural language processing. These models rely on proper human instructions (or prompts) to generate suitable responses. However, the potential of LLMs is not fully harnessed by commonly used prompting methods: many human-in-the-loop algorithms employ ad hoc procedures for prompt selection, while automatic prompt generation approaches essentially search all possible prompts randomly and inefficiently. We propose Evoke, an automatic prompt refinement framework. In Evoke, there are two instances of the same LLM: one acts as a reviewer (LLM-Reviewer) and scores the current prompt; the other acts as an author (LLM-Author) and edits the prompt by considering the edit history and the reviewer's feedback. Such an author-reviewer feedback loop ensures that the prompt is refined in each iteration. We further add a data selection approach to Evoke, where only the hard samples are exposed to the LLM. The hard samples are more important because the LLM can develop a deeper understanding of the tasks from them, while the model may already know how to solve the easier cases. Experimental results show that Evoke significantly outperforms existing methods. For instance, in the challenging task of logical fallacy detection, Evoke scores above 80, while all other baseline methods struggle to reach 20.
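
The reviewer-author loop and the hard-sample selection described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the names `refine_prompt`, `select_hard_samples`, `toy_reviewer`, and `toy_author` are hypothetical, the two toy functions stand in for real LLM calls, and keeping the best-scoring prompt as well as ranking samples by a caller-supplied `difficulty` function are assumptions, since the abstract does not say how either is done.

```python
from typing import Callable, List, Tuple

# Hypothetical role signatures; in Evoke both roles are played by the same LLM.
Reviewer = Callable[[str], Tuple[float, str]]  # prompt -> (score, feedback)
Author = Callable[[str, List[Tuple[str, float]], str], str]  # -> edited prompt


def refine_prompt(initial_prompt: str, reviewer: Reviewer, author: Author,
                  iterations: int = 3) -> Tuple[str, float, List[Tuple[str, float]]]:
    """Alternate reviewer scoring and author editing for a fixed number of rounds."""
    history: List[Tuple[str, float]] = []  # edit history of (prompt, score)
    prompt = initial_prompt
    best_prompt, best_score = prompt, float("-inf")
    for _ in range(iterations + 1):
        score, feedback = reviewer(prompt)  # reviewer scores the current prompt
        history.append((prompt, score))
        if score > best_score:  # assumption: keep the best prompt seen so far
            best_prompt, best_score = prompt, score
        if len(history) > iterations:
            break  # the last pass only scores the final edit
        # author edits using the edit history and the reviewer's feedback
        prompt = author(prompt, history, feedback)
    return best_prompt, best_score, history


def select_hard_samples(samples: List[str], difficulty: Callable[[str], float],
                        k: int) -> List[str]:
    """Keep the k samples with the highest (assumed) difficulty score."""
    return sorted(samples, key=difficulty, reverse=True)[:k]


# Toy stand-ins so the sketch runs without an LLM.
KEYWORDS = ["definition", "example", "step by step"]


def toy_reviewer(prompt: str) -> Tuple[float, str]:
    missing = [k for k in KEYWORDS if k not in prompt.lower()]
    return float(len(KEYWORDS) - len(missing)), (missing[0] if missing else "")


def toy_author(prompt: str, history: List[Tuple[str, float]], feedback: str) -> str:
    return f"{prompt} Include: {feedback}." if feedback else prompt


if __name__ == "__main__":
    best, score, hist = refine_prompt("Detect the logical fallacy.",
                                      toy_reviewer, toy_author)
    print(score, best)
```

With a real model, `toy_reviewer` and `toy_author` would be replaced by two differently instructed calls to the same LLM, and `difficulty` by whatever hardness measure is chosen for the task.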

Authors (8)
  1. Xinyu Hu (32 papers)
  2. Pengfei Tang (13 papers)
  3. Simiao Zuo (25 papers)
  4. Zihan Wang (181 papers)
  5. Bowen Song (28 papers)
  6. Qiang Lou (4 papers)
  7. Jian Jiao (44 papers)
  8. Denis Charles (17 papers)
Citations (5)