Localized Zeroth-Order Prompt Optimization (2403.02993v1)
Abstract: The efficacy of large language models (LLMs) in understanding and generating natural language has sparked wide interest in developing prompt-based methods to harness the power of black-box LLMs. Existing methodologies usually prioritize a global optimization for finding the global optimum, which however can perform poorly in certain tasks. This motivates us to re-think the necessity of finding a global optimum in prompt optimization. To answer this, we conduct a thorough empirical study on prompt optimization and draw two major insights. Contrasting with the rarity of the global optimum, local optima are usually prevalent and well-performing, which can be more worthwhile for efficient prompt optimization (Insight I). The choice of the input domain, covering both the generation and the representation of prompts, affects the identification of well-performing local optima (Insight II). Inspired by these insights, we propose a novel algorithm, namely localized zeroth-order prompt optimization (ZOPO), which incorporates a Neural Tangent Kernel-derived Gaussian process into standard zeroth-order optimization for an efficient search of well-performing local optima in prompt optimization. Remarkably, ZOPO outperforms existing baselines in terms of both optimization performance and query efficiency, which we demonstrate through extensive experiments.
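The abstract describes ZOPO only at a high level: standard zeroth-order optimization whose derivative estimate comes from a Gaussian process surrogate over queried prompt representations. The sketch below is an illustrative toy, not the authors' implementation: it substitutes an RBF kernel for the NTK-based kernel, treats the prompt as a continuous embedding vector, and uses a generic black-box `score` function; all function names and hyperparameters are assumptions for illustration.

```python
import numpy as np

def rbf(a, b, ls=1.0):
    """RBF kernel matrix between rows of a and rows of b (stand-in for the NTK)."""
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2.0 * ls ** 2))

def gp_mean_grad(z, Z, y, ls=1.0, noise=1e-3):
    """Gradient of the GP posterior mean at z: (d/dz) k(z, Z) @ K^{-1} y."""
    K = rbf(Z, Z, ls) + noise * np.eye(len(Z))
    alpha = np.linalg.solve(K, y)
    k = rbf(z[None, :], Z, ls)[0]                     # kernel vector, shape (n,)
    dk = -(z[None, :] - Z) / ls ** 2 * k[:, None]     # kernel gradients, shape (n, d)
    return dk.T @ alpha                               # shape (d,)

def zopo_sketch(score, z0, n_init=5, n_iters=50, lr=0.1, ls=1.0, seed=0):
    """Localized zeroth-order ascent driven by the GP-derived gradient estimate."""
    rng = np.random.default_rng(seed)
    # Seed the surrogate with a few perturbed queries around the start point.
    Z = [z0 + 0.5 * rng.standard_normal(z0.shape) for _ in range(n_init)]
    y = [score(zi) for zi in Z]
    z = z0.astype(float).copy()
    for _ in range(n_iters):
        g = gp_mean_grad(z, np.array(Z), np.array(y), ls)
        z = z + lr * g / (np.linalg.norm(g) + 1e-12)  # normalized ascent step
        Z.append(z.copy()); y.append(score(z))        # query black box, refit surrogate
    best = int(np.argmax(y))                          # keep the best local query found
    return np.array(Z)[best], y[best]
```

For instance, maximizing a smooth black-box score such as `lambda z: -((z - 0.5) ** 2).sum()` from `np.zeros(3)` climbs toward the local optimum near `[0.5, 0.5, 0.5]` using only function queries, mirroring the local (rather than global) search the abstract advocates.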
- Wenyang Hu
- Yao Shu
- Zongmin Yu
- Zhaoxuan Wu
- Xiangqiang Lin
- Zhongxiang Dai
- See-Kiong Ng
- Bryan Kian Hsiang Low