Personalized LLM Response Generation with Parameterized Memory Injection (2404.03565v2)
Abstract: Large language models (LLMs) have exhibited remarkable proficiency in comprehending and generating natural language. Meanwhile, personalized LLM response generation holds the potential to offer substantial benefits to individuals in critical areas such as medicine. Existing research has explored memory-augmented methods that prompt the LLM with pre-stored, user-specific knowledge when generating personalized responses to new queries. We contend that such a paradigm cannot perceive fine-grained information. In this study, we propose a novel \textbf{M}emory-\textbf{i}njected approach that uses parameter-efficient fine-tuning (PEFT) together with a Bayesian Optimisation search strategy to achieve \textbf{L}LM \textbf{P}ersonalization (\textbf{MiLP}).
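To make the injection-plus-search idea concrete, below is a minimal sketch, assuming a HuggingFace `peft` LoRA adapter as the injected memory and scikit-optimize's `gp_minimize` as the Bayesian optimiser; the `gpt2` backbone, the (rank, depth) search space, and the placeholder objective are illustrative assumptions, not the paper's actual MiLP configuration.

```python
# Sketch (not the authors' implementation): inject LoRA-style "memory" adapters
# into selected transformer layers and use Bayesian optimisation to search over
# the injection configuration (adapter rank, injection depth).
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model
from skopt import gp_minimize
from skopt.space import Integer

MODEL_NAME = "gpt2"  # small stand-in backbone; the paper's base LLM may differ


def build_injected_model(rank: int, top_layer: int):
    """Attach LoRA adapters (the injected user memory) to layers [0, top_layer)."""
    model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)
    config = LoraConfig(
        r=rank,                                   # adapter rank, chosen by the search
        lora_alpha=2 * rank,
        target_modules=["c_attn"],                # GPT-2 attention projection
        layers_to_transform=list(range(top_layer)),  # restrict injection depth
        task_type="CAUSAL_LM",
    )
    return get_peft_model(model, config)


def personalization_score(params) -> float:
    """Placeholder objective: build the injected model, fine-tune the adapters on
    one user's history, and return a validation loss.  The real MiLP objective is
    not reproduced here; the constant return value is a stand-in."""
    rank, top_layer = params
    model = build_injected_model(int(rank), int(top_layer))
    # ... fine-tune adapters on user-specific data, evaluate on held-out queries ...
    val_loss = 0.0
    return val_loss


# Bayesian optimisation over the injection configuration (rank, depth).
search_space = [Integer(2, 32, name="rank"), Integer(2, 12, name="top_layer")]
result = gp_minimize(personalization_score, search_space, n_calls=15, random_state=0)
print("best (rank, top_layer):", result.x)
```

The point the sketch highlights is that the Bayesian optimisation loop treats the injection configuration itself (where the memory adapters sit and how large they are) as the search variable, rather than only tuning the adapter weights.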
Authors: Kai Zhang, Lizhi Qing, Yangyang Kang, Xiaozhong Liu