User-LLM: Efficient LLM Contextualization with User Embeddings (2402.13598v2)
Abstract: LLMs have achieved remarkable success across various domains, but effectively incorporating complex and potentially noisy user timeline data into LLMs remains a challenge. Current approaches often involve translating user timelines into text descriptions before feeding them to LLMs, which can be inefficient and may not fully capture the nuances of user behavior. Inspired by how LLMs are effectively integrated with images through direct embeddings, we propose User-LLM, a novel framework that leverages user embeddings to directly contextualize LLMs with user history interactions. These embeddings, generated by a user encoder pretrained using self-supervised learning on diverse user interactions, capture latent user behaviors and interests as well as their evolution over time. We integrate these user embeddings with LLMs through cross-attention, enabling LLMs to dynamically adapt their responses based on the context of a user's past actions and preferences. Our approach achieves significant efficiency gains by representing user timelines directly as embeddings, leading to substantial inference speedups of up to 78.1X. Comprehensive experiments on MovieLens, Amazon Review, and Google Local Review datasets demonstrate that User-LLM outperforms text-prompt-based contextualization on tasks requiring deep user understanding, with improvements of up to 16.33%, particularly excelling on long sequences that capture subtle shifts in user behavior. Furthermore, the incorporation of Perceiver layers streamlines the integration between user encoders and LLMs, yielding additional computational savings.
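To make the integration described above more concrete, the sketch below shows one plausible way to wire pretrained user embeddings into an LLM layer via gated cross-attention, with a Perceiver-style resampler compressing a long user timeline into a small set of latent vectors. This is a minimal illustration of the general idea, not the authors' implementation; all class names, dimensions, the gating scheme, and the use of PyTorch's `MultiheadAttention` are assumptions made here for clarity.

```python
# Illustrative sketch (assumed components, not the User-LLM codebase):
# user timeline embeddings -> Perceiver-style resampler -> cross-attention
# into an LLM layer's hidden states via a gated residual connection.
import torch
import torch.nn as nn


class PerceiverResampler(nn.Module):
    """Compress a long timeline of user embeddings into a few latent vectors."""

    def __init__(self, dim: int, num_latents: int = 16, num_heads: int = 8):
        super().__init__()
        self.latents = nn.Parameter(torch.randn(num_latents, dim) * 0.02)
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)

    def forward(self, user_embeddings: torch.Tensor) -> torch.Tensor:
        # user_embeddings: [batch, timeline_len, dim] from a pretrained user encoder
        latents = self.latents.unsqueeze(0).expand(user_embeddings.size(0), -1, -1)
        compressed, _ = self.attn(latents, user_embeddings, user_embeddings)
        return compressed  # [batch, num_latents, dim]


class UserCrossAttentionBlock(nn.Module):
    """LLM hidden states attend to the (resampled) user context."""

    def __init__(self, dim: int, num_heads: int = 8):
        super().__init__()
        self.norm = nn.LayerNorm(dim)
        self.cross_attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.gate = nn.Parameter(torch.zeros(1))  # starts closed; learned during tuning

    def forward(self, hidden: torch.Tensor, user_ctx: torch.Tensor) -> torch.Tensor:
        # hidden:   [batch, seq_len, dim] hidden states from one LLM layer
        # user_ctx: [batch, num_latents, dim] compressed user embeddings
        attended, _ = self.cross_attn(self.norm(hidden), user_ctx, user_ctx)
        return hidden + torch.tanh(self.gate) * attended  # gated residual injection


if __name__ == "__main__":
    batch, timeline_len, seq_len, dim = 2, 200, 32, 512
    user_timeline = torch.randn(batch, timeline_len, dim)  # stand-in for encoder output
    llm_hidden = torch.randn(batch, seq_len, dim)          # stand-in for LLM activations

    resampler = PerceiverResampler(dim)
    block = UserCrossAttentionBlock(dim)
    out = block(llm_hidden, resampler(user_timeline))
    print(out.shape)  # torch.Size([2, 32, 512])
```

The resampler is what makes the approach cheap relative to text prompting: however long the raw timeline is, the LLM only ever cross-attends to a fixed, small number of latent vectors.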
Authors: Lin Ning, Luyang Liu, Jiaxing Wu, Neo Wu, Devora Berlowitz, Sushant Prakash, Bradley Green, Shawn O'Banion, Jun Xie