
The ART of LLM Refinement: Ask, Refine, and Trust (2311.07961v1)

Published 14 Nov 2023 in cs.CL

Abstract: In recent years, LLMs have demonstrated remarkable generative abilities, but can they judge the quality of their own generations? A popular concept, referred to as self-refinement, postulates that LLMs can detect and correct the errors in their generations when asked to do so. However, recent empirical evidence points in the opposite direction, suggesting that LLMs often struggle to accurately identify errors when reasoning is involved. To address this, we propose a reasoning with refinement objective called ART: Ask, Refine, and Trust, which asks necessary questions to decide when an LLM should refine its output, and either affirms or withholds trust in its refinement by ranking the refinement against the initial prediction. On two multistep reasoning tasks, mathematical word problems (GSM8K) and question answering (StrategyQA), ART achieves a performance gain of +5 points over self-refinement baselines, while using a much smaller model as the decision maker. We also demonstrate the benefit of using smaller models to make refinement decisions as a cost-effective alternative to fine-tuning a larger model.
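
The abstract outlines a three-stage pipeline: Ask whether refinement is needed, Refine if so, then Trust by ranking the refinement against the initial answer. Below is a minimal Python sketch of that loop; the function names (`ask_subquestions`, `is_answered`, `refine`, `rank`) are hypothetical placeholders standing in for the paper's Asker and Truster components, not its actual interfaces.

```python
# Hypothetical sketch of the ART (Ask, Refine, Trust) decision loop.
# The callables below stand in for model calls; they are illustrative
# placeholders, not the paper's actual API.

from typing import Callable, List

def art_refine(
    problem: str,
    initial_answer: str,
    ask_subquestions: Callable[[str, str], List[str]],  # "Asker" (can be a smaller model)
    is_answered: Callable[[str, str], bool],             # does the answer resolve a subquestion?
    refine: Callable[[str, str, List[str]], str],        # LLM produces a revised answer
    rank: Callable[[str, str, str], str],                # "Truster" picks the better answer
) -> str:
    # Ask: generate the subquestions the answer must resolve, and decide
    # whether refinement is needed at all.
    subquestions = ask_subquestions(problem, initial_answer)
    unresolved = [q for q in subquestions if not is_answered(initial_answer, q)]
    if not unresolved:
        return initial_answer  # no unresolved subquestions, so skip refinement

    # Refine: revise the answer, conditioned on the unresolved subquestions.
    refinement = refine(problem, initial_answer, unresolved)

    # Trust: rank the refinement against the initial prediction rather than
    # accepting it blindly, affirming or withholding trust in the refinement.
    return rank(problem, initial_answer, refinement)
```

Per the abstract, the Ask and Trust decisions can be delegated to a much smaller model than the one producing the answers, which is what makes the approach cost-effective relative to fine-tuning the larger model.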

Authors (9)
  1. Kumar Shridhar (25 papers)
  2. Koustuv Sinha (31 papers)
  3. Andrew Cohen (24 papers)
  4. Tianlu Wang (33 papers)
  5. Ping Yu (42 papers)
  6. Ram Pasunuru (4 papers)
  7. Mrinmaya Sachan (124 papers)
  8. Jason Weston (130 papers)
  9. Asli Celikyilmaz (81 papers)
Citations (18)