The Importance of Directional Feedback for LLM-based Optimizers (2405.16434v2)
Abstract: We study the potential of using an LLM as an interactive optimizer for solving maximization problems in a text space using natural language and numerical feedback. Inspired by the classical optimization literature, we classify natural language feedback into directional and non-directional, where the former generalizes first-order feedback to the natural language space. We find that LLMs are especially capable of optimization when they are provided with directional feedback. Based on this insight, we design a new LLM-based optimizer that synthesizes directional feedback from the historical optimization trace to achieve reliable improvement over iterations. Empirically, we show that our LLM-based optimizer is more stable and efficient than existing techniques at solving optimization problems, from maximizing mathematical functions to optimizing prompts for writing poems.
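To make the abstract's idea concrete, below is a minimal sketch (not the authors' implementation) of an LLM-driven maximization loop that feeds the model a synthesized directional summary of the optimization trace. The `llm` callable, the prompt wording, and the simple rule in `synthesize_directional_feedback` are illustrative assumptions, not details taken from the paper.

```python
# Sketch only: an LLM-in-the-loop maximizer guided by trace-derived
# directional feedback. All prompt text and helper names are hypothetical.
from typing import Callable, List, Tuple


def synthesize_directional_feedback(trace: List[Tuple[str, float]]) -> str:
    """Turn the last two (candidate, score) pairs into a directional hint.

    The paper synthesizes richer feedback from the full trace; this rule is
    a deliberately simple stand-in.
    """
    if len(trace) < 2:
        return "No directional information yet; explore freely."
    (_, prev_score), (_, curr_score) = trace[-2], trace[-1]
    if curr_score > prev_score:
        return (f"Score rose from {prev_score:.3f} to {curr_score:.3f} after "
                f"the last change; keep moving in that direction.")
    return (f"Score fell from {prev_score:.3f} to {curr_score:.3f} after "
            f"the last change; reverse or shrink that change.")


def llm_maximize(
    llm: Callable[[str], str],          # wrapper around a chat-completion API
    objective: Callable[[str], float],  # black-box numerical score of a text candidate
    initial_candidate: str,
    num_iterations: int = 10,
) -> Tuple[str, float]:
    """Iteratively ask the LLM for a better candidate, conditioned on the trace."""
    trace: List[Tuple[str, float]] = [(initial_candidate, objective(initial_candidate))]
    for _ in range(num_iterations):
        history = "\n".join(f"candidate: {x!r} -> score: {y:.3f}" for x, y in trace)
        prompt = (
            "You are optimizing a text candidate to maximize a numerical score.\n"
            f"History:\n{history}\n"
            f"Directional feedback: {synthesize_directional_feedback(trace)}\n"
            "Propose one improved candidate. Reply with the candidate only."
        )
        candidate = llm(prompt).strip()
        trace.append((candidate, objective(candidate)))
    # Return the best candidate seen, not just the last one.
    return max(trace, key=lambda pair: pair[1])
```

In practice, `llm` would wrap an actual chat model and `objective` could be, for example, a scorer for how well a generated poem meets its constraints; the snippet is only meant to illustrate what "optimizing with directional feedback synthesized from the trace" looks like as a loop.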
Authors: Allen Nie, Ching-An Cheng, Andrey Kolobov, Adith Swaminathan