Automatic Prompt Selection for Large Language Models (2404.02717v1)

Published 3 Apr 2024 in cs.CL and cs.LG

Abstract: LLMs can perform various natural language processing tasks given suitable instruction prompts. However, designing effective prompts manually is challenging and time-consuming, and existing methods for automatic prompt optimization lack either flexibility or efficiency. In this paper, we propose an effective approach to automatically select the optimal prompt for a given input from a finite set of synthetic candidate prompts. Our approach consists of three steps: (1) clustering the training data and generating candidate prompts for each cluster using an LLM-based prompt generator; (2) synthesizing a dataset of input-prompt-output tuples for training a prompt evaluator to rank the prompts based on their relevance to the input; (3) using the prompt evaluator to select the best prompt for a new input at test time. Our approach balances prompt generality and specificity and eliminates the need for resource-intensive training and inference. It demonstrates competitive performance on zero-shot question-answering datasets: GSM8K, MultiArith, and AQuA.
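
The three-step pipeline in the abstract maps naturally onto a short program. The sketch below illustrates the idea in Python under stated assumptions: `call_llm` is a hypothetical stand-in for any LLM completion API, the cosine-similarity scorer is only a placeholder for the paper's trained prompt evaluator, and the embedding model, cluster count, and meta-prompt wording are illustrative choices rather than the authors' settings.

```python
# Minimal sketch of the abstract's three-step prompt-selection pipeline.
# Assumptions: scikit-learn and sentence-transformers are installed;
# call_llm() is a hypothetical placeholder for a real LLM completion API.
import numpy as np
from sklearn.cluster import KMeans
from sentence_transformers import SentenceTransformer

embedder = SentenceTransformer("all-MiniLM-L6-v2")  # illustrative choice

def call_llm(prompt: str) -> str:
    """Hypothetical LLM call; wire this to an actual completion API."""
    raise NotImplementedError

# Step 1: cluster the training inputs, then have an LLM-based generator
# write one candidate prompt per cluster.
def build_candidate_prompts(train_inputs: list[str], k: int = 5) -> list[str]:
    X = embedder.encode(train_inputs)
    labels = KMeans(n_clusters=k, n_init=10).fit_predict(X)
    prompts = []
    for c in range(k):
        examples = [x for x, lab in zip(train_inputs, labels) if lab == c][:3]
        meta_prompt = (
            "Write one instruction prompt that would help a model answer "
            "questions like these:\n" + "\n".join(examples)
        )
        prompts.append(call_llm(meta_prompt))
    return prompts

# Step 2 (stand-in): score how well a prompt fits an input. The paper trains
# an evaluator on synthesized (input, prompt, output) tuples; plain cosine
# similarity here is only a placeholder for that learned ranker.
def evaluator_score(input_text: str, prompt: str) -> float:
    a, b = embedder.encode([input_text, prompt])
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Step 3: at test time, rank the finite candidate set and answer with the
# highest-scoring prompt, so no per-input prompt optimization is needed.
def answer(test_input: str, candidate_prompts: list[str]) -> str:
    best = max(candidate_prompts, key=lambda p: evaluator_score(test_input, p))
    return call_llm(f"{best}\n\nQuestion: {test_input}\nAnswer:")
```

Selecting from a finite, precomputed candidate set is what keeps test-time cost low: only the lightweight evaluator runs per input before a single LLM call produces the answer.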

Authors (8)
  1. Viet-Tung Do (1 paper)
  2. Van-Khanh Hoang (1 paper)
  3. Duy-Hung Nguyen (6 papers)
  4. Shahab Sabahi (3 papers)
  5. Jeff Yang (3 papers)
  6. Hajime Hotta (3 papers)
  7. Minh-Tien Nguyen (19 papers)
  8. Hung Le (120 papers)
Citations (2)