Optimization Methods for Personalizing Large Language Models through Retrieval Augmentation (2404.05970v1)

Published 9 Apr 2024 in cs.CL and cs.IR

Abstract: This paper studies retrieval-augmented approaches for personalizing large language models (LLMs), which could have a substantial impact on a wide range of applications and domains. We present the first attempt to optimize the retrieval models that deliver a limited number of personal documents to LLMs for personalized generation. We develop two optimization algorithms that use feedback from the downstream personalized generation task to optimize retrieval: one based on reinforcement learning, whose reward function can be defined using an arbitrary metric for personalized generation, and another based on knowledge distillation from the downstream LLM to the retrieval model. This paper also introduces pre- and post-generation retriever selection models that decide which retriever to use for each LLM input. Extensive experiments on diverse tasks from the language model personalization (LaMP) benchmark reveal statistically significant improvements in six out of seven datasets.
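The reinforcement learning idea in the abstract can be illustrated with a toy policy-gradient (REINFORCE-style) update. This is only a sketch under stated assumptions, not the authors' implementation: the retriever is reduced to one free score per document rather than a neural model, and the `rewards` list is a hypothetical stand-in for a downstream generation metric computed on the LLM output produced with each document. The names `softmax`, `reinforce_step`, and `train` are illustrative.

```python
import math
import random


def softmax(scores):
    """Turn retrieval scores into a sampling distribution over documents."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def reinforce_step(scores, sampled, reward, baseline=0.0, lr=1.0):
    """One REINFORCE update on per-document scores.

    For a softmax policy, d log p(sampled) / d score_j = 1[j == sampled] - p_j.
    The reward is an assumed scalar metric of the downstream generation.
    """
    probs = softmax(scores)
    advantage = reward - baseline
    return [
        s + lr * advantage * ((1.0 if j == sampled else 0.0) - p)
        for j, (s, p) in enumerate(zip(scores, probs))
    ]


def train(rewards, steps=200, lr=0.5, seed=0):
    """Toy loop: sample a document, observe its (hypothetical) reward, update."""
    rng = random.Random(seed)
    scores = [0.0] * len(rewards)
    baseline = 0.0  # running mean of observed rewards, used to reduce variance
    for t in range(1, steps + 1):
        probs = softmax(scores)
        sampled = rng.choices(range(len(scores)), weights=probs)[0]
        reward = rewards[sampled]
        scores = reinforce_step(scores, sampled, reward, baseline, lr)
        baseline += (reward - baseline) / t
    return softmax(scores)


if __name__ == "__main__":
    # Hypothetical per-document rewards, e.g. a metric score in [0, 1].
    print(train([0.1, 0.2, 0.9]))
```

In a real system the scores would come from a trainable retriever and the reward from running the LLM on the retrieved document, so each update would backpropagate through the retriever's parameters instead of adjusting scores directly.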

Authors (3)
  1. Alireza Salemi (21 papers)
  2. Surya Kallumadi (15 papers)
  3. Hamed Zamani (88 papers)
Citations (21)