Summary of "Connecting Large Language Models with Evolutionary Algorithms Yields Powerful Prompt Optimizers"
This paper presents an approach to discrete prompt optimization that connects LLMs with Evolutionary Algorithms (EAs). The framework aims to automate prompt design for LLMs, a process that traditionally requires significant human effort. The authors introduce a discrete prompt tuning framework that applies evolutionary strategies to optimize prompts for both closed- and open-source LLMs, including GPT-3.5 and Alpaca. The framework capitalizes on LLMs' linguistic capabilities and EAs' optimization efficiency, and it works without access to model parameters or gradients.
Methodology and Framework
The authors' framework builds upon the principles of EAs, specifically the Genetic Algorithm (GA) and Differential Evolution (DE), and adapts them to the discrete prompt space. The key innovation is using LLMs as evolutionary operators, performing prompt mutation and crossover while preserving linguistic coherence. The framework consists of three main steps:
- Initial Population: It starts with a set of human-crafted prompts and randomly generated prompts to ensure diversity and leverage prior knowledge.
- Evolution: Using LLMs, new prompt variants are generated through mutation and crossover operations akin to genetic processes, selected based on their performance on specific tasks.
- Update and Iteration: The population of prompts is iteratively refined by retaining those with higher performance scores.
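The three steps above can be sketched as a generic loop. Here, `llm_evolve` and `score` are hypothetical stand-ins (not names from the paper) for the LLM-based evolutionary operator and the task-specific fitness function:

```python
import random

def prompt_evolution_loop(initial_prompts, score, llm_evolve,
                          pop_size=10, iterations=5):
    """Generic evolutionary prompt-optimization loop (illustrative sketch).

    score(prompt) -> float    : fitness on a development set (higher is better)
    llm_evolve(parents) -> str: asks an LLM to mutate/cross over parent prompts
    """
    # Step 1: initial population (human-crafted plus generated prompts)
    population = list(initial_prompts)[:pop_size]

    for _ in range(iterations):
        # Step 2: evolution — generate new candidates from sampled parents
        children = []
        for _ in range(pop_size):
            parents = random.sample(population, min(2, len(population)))
            children.append(llm_evolve(parents))

        # Step 3: update — retain only the top-scoring prompts
        population = sorted(population + children, key=score,
                            reverse=True)[:pop_size]

    return max(population, key=score)
```

Because selection only ever discards lower-scoring prompts, the best prompt in the population is monotonically non-decreasing in fitness across iterations.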
By instantiating the framework with both GA and DE, the authors demonstrate its flexibility in adapting traditional evolutionary strategies to natural language tasks.
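To illustrate how the two instantiations differ mainly in the instruction sent to the LLM operator, the sketch below shows paraphrased meta-prompt templates; these are assumptions for illustration, not the paper's exact wording:

```python
# Hypothetical meta-prompt templates (paraphrased for illustration).

# GA-style operator: cross over two parent prompts, then mutate the result.
GA_TEMPLATE = (
    "Cross over the following two prompts and mutate the result:\n"
    "Prompt 1: {p1}\nPrompt 2: {p2}\nNew prompt:"
)

# DE-style operator: mutate the *difference* between two prompts, combine it
# with the best prompt so far, then cross with the current target prompt.
DE_TEMPLATE = (
    "Identify the parts that differ between Prompt 1 and Prompt 2, "
    "mutate those parts, combine them with the best prompt so far, "
    "and cross the result with the current prompt:\n"
    "Prompt 1: {p1}\nPrompt 2: {p2}\n"
    "Best prompt: {best}\nCurrent prompt: {cur}\nNew prompt:"
)

def build_operator_request(template, **prompts):
    """Fill a meta-prompt template with concrete parent prompts."""
    return template.format(**prompts)
```

The filled-in request is then sent to the LLM, whose generated text serves directly as the child prompt.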
Results and Analysis
Extensive experiments were conducted across 31 datasets, covering diverse language understanding and generation tasks, as well as the challenging BIG-Bench Hard (BBH) tasks. The results are noteworthy:
- Language Understanding: The framework outperformed existing methods, including manually crafted prompts and prior automatic prompt generation techniques. For instance, on sentiment classification and topic classification datasets, significant accuracy improvements were observed.
- Language Generation: For tasks such as text summarization and simplification, the framework not only achieved superior ROUGE and SARI scores but also demonstrated that DE generally yields better prompts compared to GA in complex generation tasks.
- BBH Tasks: The approach achieved up to a 25% improvement in some BBH tasks, suggesting its potential for reasoning-heavy tasks. DE showed particular strength in navigating the complexity and achieving higher accuracy gains.
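For the generation tasks above, the fitness score driving selection can be a metric such as ROUGE. A minimal unigram-F1 scorer in the spirit of ROUGE-1, written from scratch for illustration rather than taken from an evaluation library, might look like:

```python
from collections import Counter

def rouge1_f1(candidate: str, reference: str) -> float:
    """Unigram-overlap F1 — a simplified stand-in for ROUGE-1."""
    cand = candidate.lower().split()
    ref = reference.lower().split()
    if not cand or not ref:
        return 0.0
    # Count unigrams appearing in both, respecting multiplicity
    overlap = sum((Counter(cand) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(cand)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)
```

Scoring each prompt by the average metric of the outputs it elicits on a development set yields the fitness values used during selection.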
The paper provides a detailed comparative analysis highlighting DE's adaptiveness, particularly its ability to escape local optima, while GA proves robust when the initial population contains top-performing prompts.
Implications and Future Directions
The proposed approach opens new avenues for integrating traditional optimization algorithms with modern AI models. Practically, it could reduce the labor-intensive process of prompt engineering, democratizing the use of LLMs. Theoretically, it invites further exploration of hybrid models that combine LLMs with other optimization paradigms, such as Particle Swarm Optimization (PSO) or Ant Colony Optimization. It also suggests a path for applying this synergy in diverse domains, including game design and text-to-image tasks.
Ultimately, this paper lays a solid foundation for future research at the intersection of computational linguistics and optimization, demonstrating concrete improvements in LLM performance across varied tasks through evolutionary techniques. The release of the optimized prompts for common NLP tasks further contributes to the field by providing resources for continued investigation and application.