
Learning to Rewrite Prompts for Personalized Text Generation (2310.00152v2)

Published 29 Sep 2023 in cs.CL

Abstract: Facilitated by LLMs, personalized text generation has become a rapidly growing research direction. Most existing studies focus on designing specialized models for a particular domain, or require fine-tuning LLMs to generate personalized text. We consider a typical scenario in which the LLM that generates personalized output is frozen and can only be accessed through APIs. Under this constraint, all one can do is improve the input text (i.e., the text prompt) sent to the LLM, a procedure usually done manually. In this paper, we propose a novel method to automatically revise prompts for personalized text generation. The proposed method takes the initial prompts generated by a state-of-the-art, multistage framework for personalized generation and rewrites a few critical components that summarize and synthesize the personal context. The prompt rewriter employs a training paradigm that chains supervised learning (SL) and reinforcement learning (RL), where SL reduces the search space of RL and RL enables end-to-end training of the rewriter. Using datasets from three representative domains, we demonstrate that the rewritten prompts outperform both the original prompts and prompts optimized via supervised learning or reinforcement learning alone. In-depth analysis shows that the rewritten prompts are not only human-readable but can also guide manual prompt revision when resources to train the rewriter with reinforcement learning are limited, or when deploying an automatic prompt rewriter at inference time is costly.
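
The SL-then-RL pipeline described in the abstract can be sketched in miniature. The sketch below is purely illustrative and not the paper's actual method: `reward` stands in for calling the frozen LLM and scoring its output (the paper scores generated text, e.g. with metrics such as BLEU/ROUGE), `sl_candidates` stands in for a supervised rewriter that narrows the search space, and `rl_refine` replaces PPO-style policy updates with simple hill-climbing on the end-to-end reward. All function names, the toy reward, and the example prompts are assumptions made for this sketch.

```python
import random


def reward(prompt: str, target: str) -> float:
    """Toy end-to-end reward: word overlap between the prompt and the
    user's ground-truth text. A real system would send the prompt to the
    frozen LLM via its API and score the generated text instead."""
    out = set(prompt.lower().split())
    ref = set(target.lower().split())
    return len(out & ref) / max(len(ref), 1)


def sl_candidates(initial_prompt: str) -> list[str]:
    """Stage 1 (supervised): a rewriter trained on (prompt, revision)
    pairs proposes a small pool of rewrites, shrinking the space RL must
    search. Here the pool is faked with hand-written edits."""
    return [
        initial_prompt,
        initial_prompt + " Write in the user's usual concise style.",
        initial_prompt + " Emphasize topics the user writes about often.",
    ]


def rl_refine(candidates: list[str], target: str,
              steps: int = 50, seed: int = 0) -> str:
    """Stage 2 (reinforcement): start from the best SL candidate and
    hill-climb on the end-to-end reward, a crude stand-in for PPO
    updates to the rewriter's policy."""
    rng = random.Random(seed)
    best = max(candidates, key=lambda p: reward(p, target))
    vocab = target.split()  # mutation vocabulary for the toy search
    for _ in range(steps):
        mutated = best + " " + rng.choice(vocab)
        if reward(mutated, target) > reward(best, target):
            best = mutated  # keep a mutation only if reward improves
    return best


# Usage: SL proposes, RL refines; the refined prompt never scores worse
# than the best supervised candidate under this toy reward.
target = "reply about the quarterly report in a friendly tone"
pool = sl_candidates("Write a reply for this user.")
refined = rl_refine(pool, target)
```

The division of labor mirrors the abstract's claim: supervised learning alone yields reasonable but suboptimal rewrites, while reinforcement learning alone would have to explore an enormous prompt space; chaining them lets RL fine-tune within a pool that SL has already made plausible.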

Authors (5)
  1. Cheng Li
  2. Mingyang Zhang
  3. Qiaozhu Mei
  4. Weize Kong
  5. Michael Bendersky
Citations (8)