A Brief History of Prompt: Leveraging Language Models (Through Advanced Prompting) (2310.04438v2)

Published 30 Sep 2023 in cs.CL and cs.AI

Abstract: This paper presents a comprehensive exploration of the evolution of prompt engineering and generation in the field of NLP. Starting from early language models and information retrieval systems, we trace the key developments that have shaped prompt engineering over the years. The introduction of attention mechanisms in 2015 revolutionized language understanding, leading to advancements in controllability and context-awareness. Subsequent breakthroughs in reinforcement learning techniques further enhanced prompt engineering, addressing issues like exposure bias and biases in generated text. We examine the significant contributions in 2018 and 2019, focusing on fine-tuning strategies, control codes, and template-based generation. The paper also discusses the growing importance of fairness, human-AI collaboration, and low-resource adaptation. In 2020 and 2021, contextual prompting and transfer learning gained prominence, while 2022 and 2023 witnessed the emergence of advanced techniques like unsupervised pre-training and novel reward shaping. Throughout the paper, we reference specific research studies that exemplify the impact of various developments on prompt engineering. The journey of prompt engineering continues, with ethical considerations being paramount for the responsible and inclusive future of AI systems.
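
The abstract surveys, among other techniques, template-based generation and the few-shot prompting popularized by Brown et al. (reference 17 below). As an illustration only (the task, template wording, and examples here are invented, not taken from the paper), a minimal Python sketch of template-based few-shot prompt construction might look like:

```python
# Minimal sketch of template-based, few-shot prompt construction.
# The template text and labeled examples are illustrative assumptions,
# not drawn from the paper.

FEW_SHOT_EXAMPLES = [
    ("The movie was a delight.", "positive"),
    ("I want my money back.", "negative"),
]

TEMPLATE = "Review: {text}\nSentiment: {label}"

def build_prompt(query: str) -> str:
    """Fill the template with labeled examples, then append the unlabeled query."""
    shots = "\n\n".join(
        TEMPLATE.format(text=t, label=l) for t, l in FEW_SHOT_EXAMPLES
    )
    # The final label slot is left blank so a completion model fills it in.
    return shots + "\n\n" + TEMPLATE.format(text=query, label="").rstrip()

print(build_prompt("The plot dragged on forever."))
```

A completion-style model would then be asked to continue the string after the final "Sentiment:"; the labeled examples in the context are what make this a few-shot rather than zero-shot prompt.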

References (27)
  1. A. Vaswani, N. M. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, L. Kaiser, and I. Polosukhin, “Attention is all you need,” in NIPS, 2017. [Online]. Available: https://api.semanticscholar.org/CorpusID:13756489
  2. J. Luketina, N. Nardelli, G. Farquhar, J. N. Foerster, J. Andreas, E. Grefenstette, S. Whiteson, and T. Rocktäschel, “A survey of reinforcement learning informed by natural language,” ArXiv, vol. abs/1906.03926, 2019. [Online]. Available: https://api.semanticscholar.org/CorpusID:182952502
  3. R. Paulus, C. Xiong, and R. Socher, “A deep reinforced model for abstractive summarization,” ArXiv, vol. abs/1705.04304, 2017. [Online]. Available: https://api.semanticscholar.org/CorpusID:21850704
  4. B. Liu, G. Tür, D. Z. Hakkani-Tür, P. Shah, and L. Heck, “Dialogue learning with human teaching and feedback in end-to-end trainable task-oriented dialogue systems,” in North American Chapter of the Association for Computational Linguistics, 2018. [Online]. Available: https://api.semanticscholar.org/CorpusID:4938015
  5. M. Ranzato, S. Chopra, M. Auli, and W. Zaremba, “Sequence level training with recurrent neural networks,” CoRR, vol. abs/1511.06732, 2015. [Online]. Available: https://api.semanticscholar.org/CorpusID:7147309
  6. S. Bengio, O. Vinyals, N. Jaitly, and N. M. Shazeer, “Scheduled sampling for sequence prediction with recurrent neural networks,” ArXiv, vol. abs/1506.03099, 2015. [Online]. Available: https://api.semanticscholar.org/CorpusID:1820089
  7. P. F. Christiano, J. Leike, T. B. Brown, M. Martic, S. Legg, and D. Amodei, “Deep reinforcement learning from human preferences,” ArXiv, vol. abs/1706.03741, 2017. [Online]. Available: https://api.semanticscholar.org/CorpusID:4787508
  8. T. Chakraborty, G. Badie, and B. Rudder, “Reducing gender bias in word embeddings,” 2016. [Online]. Available: https://api.semanticscholar.org/CorpusID:12616116
  9. B. Zhang, B. Lemoine, and M. Mitchell, “Mitigating unwanted biases with adversarial learning,” Proceedings of the 2018 AAAI/ACM Conference on AI, Ethics, and Society, 2018. [Online]. Available: https://api.semanticscholar.org/CorpusID:9424845
  10. J. Devlin, M.-W. Chang, K. Lee, and K. Toutanova, “BERT: Pre-training of deep bidirectional transformers for language understanding,” ArXiv, vol. abs/1810.04805, 2019. [Online]. Available: https://api.semanticscholar.org/CorpusID:52967399
  11. Y. Liu, M. Ott, N. Goyal, J. Du, M. Joshi, D. Chen, O. Levy, M. Lewis, L. Zettlemoyer, and V. Stoyanov, “RoBERTa: A robustly optimized BERT pretraining approach,” ArXiv, vol. abs/1907.11692, 2019. [Online]. Available: https://api.semanticscholar.org/CorpusID:198953378
  12. J. Howard and S. Ruder, “Universal language model fine-tuning for text classification,” in Annual Meeting of the Association for Computational Linguistics, 2018. [Online]. Available: https://api.semanticscholar.org/CorpusID:40100965
  13. C. Sun, X. Qiu, Y. Xu, and X. Huang, “How to fine-tune BERT for text classification?” in China National Conference on Chinese Computational Linguistics, 2019. [Online]. Available: https://api.semanticscholar.org/CorpusID:153312532
  14. K. Labusch, C. Neudecker, and D. Zellhöfer, “BERT for named entity recognition in contemporary and historic German,” in Conference on Natural Language Processing, 2019. [Online]. Available: https://api.semanticscholar.org/CorpusID:208192606
  15. A. Coenen, E. Reif, A. Yuan, B. Kim, A. Pearce, F. B. Viégas, and M. Wattenberg, “Visualizing and measuring the geometry of BERT,” ArXiv, vol. abs/1906.02715, 2019. [Online]. Available: https://api.semanticscholar.org/CorpusID:174802633
  16. M. E. Peters, M. Neumann, M. Iyyer, M. Gardner, C. Clark, K. Lee, and L. Zettlemoyer, “Deep contextualized word representations,” in North American Chapter of the Association for Computational Linguistics, 2018. [Online]. Available: https://api.semanticscholar.org/CorpusID:3626819
  17. T. B. Brown, B. Mann, N. Ryder, M. Subbiah, J. Kaplan, P. Dhariwal, A. Neelakantan, P. Shyam, G. Sastry, A. Askell, S. Agarwal, A. Herbert-Voss, G. Krueger, T. J. Henighan, R. Child, A. Ramesh, D. M. Ziegler, J. Wu, C. Winter, C. Hesse, M. Chen, E. Sigler, M. Litwin, S. Gray, B. Chess, J. Clark, C. Berner, S. McCandlish, A. Radford, I. Sutskever, and D. Amodei, “Language models are few-shot learners,” ArXiv, vol. abs/2005.14165, 2020. [Online]. Available: https://api.semanticscholar.org/CorpusID:218971783
  18. J. Kaplan, S. McCandlish, T. J. Henighan, T. B. Brown, B. Chess, R. Child, S. Gray, A. Radford, J. Wu, and D. Amodei, “Scaling laws for neural language models,” ArXiv, vol. abs/2001.08361, 2020. [Online]. Available: https://api.semanticscholar.org/CorpusID:210861095
  19. A. Radford, J. Wu, R. Child, D. Luan, D. Amodei, and I. Sutskever, “Language models are unsupervised multitask learners,” 2019. [Online]. Available: https://api.semanticscholar.org/CorpusID:160025533
  20. N. S. Keskar, B. McCann, L. R. Varshney, C. Xiong, and R. Socher, “CTRL: A conditional transformer language model for controllable generation,” ArXiv, vol. abs/1909.05858, 2019. [Online]. Available: https://api.semanticscholar.org/CorpusID:202573071
  21. D. M. Ziegler, N. Stiennon, J. Wu, T. B. Brown, A. Radford, D. Amodei, P. Christiano, and G. Irving, “Fine-tuning language models from human preferences,” ArXiv, vol. abs/1909.08593, 2019. [Online]. Available: https://api.semanticscholar.org/CorpusID:202660943
  22. M. T. Ribeiro, T. S. Wu, C. Guestrin, and S. Singh, “Beyond accuracy: Behavioral testing of NLP models with CheckList,” ArXiv, vol. abs/2005.04118, 2020. [Online]. Available: https://api.semanticscholar.org/CorpusID:218551201
  23. E. Fleisig and C. D. Fellbaum, “Mitigating gender bias in machine translation through adversarial learning,” ArXiv, vol. abs/2203.10675, 2022. [Online]. Available: https://api.semanticscholar.org/CorpusID:247594904
  24. X. Han, T. Baldwin, and T. Cohn, “Towards equal opportunity fairness through adversarial learning,” ArXiv, vol. abs/2203.06317, 2022. [Online]. Available: https://api.semanticscholar.org/CorpusID:247447211
  25. S. Jain and B. C. Wallace, “Attention is not explanation,” in North American Chapter of the Association for Computational Linguistics, 2019. [Online]. Available: https://api.semanticscholar.org/CorpusID:67855860
  26. A. Rahimi, Y. Li, and T. Cohn, “Massively multilingual transfer for NER,” in Annual Meeting of the Association for Computational Linguistics, 2019. [Online]. Available: https://api.semanticscholar.org/CorpusID:153313061
  27. J. Hu, S. Ruder, A. Siddhant, G. Neubig, O. Firat, and M. Johnson, “XTREME: A massively multilingual multi-task benchmark for evaluating cross-lingual generalization,” ArXiv, vol. abs/2003.11080, 2020. [Online]. Available: https://api.semanticscholar.org/CorpusID:214641214
Authors (1)
  1. Golam Md Muktadir (2 papers)
Citations (4)