
End-to-end Training for Recommendation with Language-based User Profiles (2410.18870v2)

Published 24 Oct 2024 in cs.IR and cs.LG

Abstract: There is a growing interest in natural language-based user profiles for recommender systems, which aim to enhance transparency and scrutability compared with embedding-based methods. Existing studies primarily generate these profiles using zero-shot inference from LLMs, but their quality remains insufficient, leading to suboptimal recommendation performance. In this paper, we introduce LangPTune, the first end-to-end training framework to optimize LLM-generated user profiles. Our method significantly outperforms zero-shot approaches by explicitly training the LLM for the recommendation objective. Through extensive evaluations across diverse training configurations and benchmarks, we demonstrate that LangPTune not only surpasses zero-shot baselines but can also match the performance of state-of-the-art embedding-based methods. Finally, we investigate whether the training procedure preserves the interpretability of these profiles compared to zero-shot inference through both GPT-4 simulations and crowdworker user studies. Implementation of LangPTune can be found at https://github.com/ZhaolinGao/LangPTune.
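To make the pipeline in the abstract concrete, the following minimal Python sketch shows the general shape of a language-based profile recommender: an LLM-style generator turns a user's interaction history into a natural-language profile, candidate items are ranked against that profile, and a recommendation loss supplies the training signal that an end-to-end method would feed back into profile generation. All names here (generate_profile, encode_text, the toy hash encoder, the contrastive-style loss) are hypothetical stand-ins and not LangPTune's actual API; since the profile is discrete text, the real framework cannot backpropagate through it directly and instead trains the LLM with policy-optimization-style updates. See the linked repository for the authors' implementation.

```python
# Illustrative sketch only: hypothetical names, toy encoder, not LangPTune's code.
import torch
import torch.nn.functional as F

def generate_profile(history: list[str]) -> str:
    """Stand-in for an LLM call that summarizes a user's interaction history
    into a natural-language profile (the step LangPTune trains end-to-end)."""
    return "A user who recently interacted with: " + "; ".join(history)

def encode_text(texts: list[str], dim: int = 64) -> torch.Tensor:
    """Toy text encoder: hashed bag-of-words embedding, L2-normalized.
    A real system would use a trained text/item encoder instead."""
    vecs = torch.zeros(len(texts), dim)
    for i, text in enumerate(texts):
        for tok in text.lower().split():
            vecs[i, hash(tok) % dim] += 1.0
    return F.normalize(vecs, dim=-1)

def recommendation_loss(profile_emb: torch.Tensor, item_embs: torch.Tensor,
                        positive_idx: int) -> torch.Tensor:
    """Contrastive-style objective: the item the user actually chose should
    score highest against the generated profile. Its value is the kind of
    reward/loss signal an end-to-end method optimizes the profile generator for."""
    scores = item_embs @ profile_emb  # cosine similarities (embeddings are normalized)
    return F.cross_entropy(scores.unsqueeze(0), torch.tensor([positive_idx]))

history = ["wireless noise-cancelling headphones", "USB-C charging cable"]
items = ["bluetooth speaker", "garden hose", "portable power bank"]

profile = generate_profile(history)
profile_emb = encode_text([profile])[0]
item_embs = encode_text(items)

print(profile)
print("ranking scores:", (item_embs @ profile_emb).tolist())
print("toy recommendation loss:", recommendation_loss(profile_emb, item_embs, positive_idx=2).item())
```

Under this framing, a zero-shot baseline would keep generate_profile fixed, whereas an end-to-end approach like the one described in the abstract updates the profile generator so that the downstream recommendation objective improves.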

