OpenP5: An Open-Source Platform for Developing, Training, and Evaluating LLM-based Recommender Systems (2306.11134v2)

Published 19 Jun 2023 in cs.IR

Abstract: In recent years, the integration of LLMs into recommender systems has garnered interest among both practitioners and researchers. Despite this interest, the field is still emerging, and the lack of open-source R&D platforms may impede the exploration of LLM-based recommendations. This paper introduces OpenP5, an open-source platform designed as a resource to facilitate the development, training, and evaluation of LLM-based generative recommender systems for research purposes. The platform is implemented using encoder-decoder LLMs (e.g., T5) and decoder-only LLMs (e.g., Llama-2) across 10 widely recognized public datasets, catering to two fundamental recommendation tasks: sequential and straightforward recommendations. Recognizing the crucial role of item IDs in LLM-based recommendations, we have also incorporated three item indexing methods within the OpenP5 platform: random indexing, sequential indexing and collaborative indexing. Built on the Transformers library, the platform facilitates easy customization of LLM-based recommendations for users. OpenP5 boasts a range of features including extensible data processing, task-centric optimization, comprehensive datasets and checkpoints, efficient acceleration, and standardized evaluations, making it a valuable tool for the implementation and evaluation of LLM-based recommender systems. The open-source code and pre-trained checkpoints for the OpenP5 library are publicly available at https://github.com/agiresearch/OpenP5.

A Detailed Analysis of OpenP5: An Open-Source Library for Foundation Model Benchmarking in Recommendation Systems

The paper "OpenP5: Benchmarking Foundation Models for Recommendation" provides a comprehensive overview of an open-source library designed for evaluating foundation models within the recommendation domain. Through this library, the authors aim to address the absence of standardized benchmarks in the burgeoning field of recommendation foundation models, which builds upon the Pre-train, Personalized Prompt, and Predict Paradigm (P5).

Key Components of OpenP5

OpenP5 is organized along three critical dimensions: downstream task, recommendation dataset, and item indexing method. Together, these dimensions define a systematic framework for deploying and evaluating LLM-based recommendation models.

  1. Downstream Tasks: The authors focus on two primary downstream tasks—sequential recommendation and straightforward recommendation. Sequential recommendation involves predicting the next item for a user based on their interaction history, whereas straightforward recommendation bases predictions solely on the user ID.
  2. Recommendation Datasets: The library is implemented on ten widely used public datasets, selected by analyzing how frequently datasets appear in recent academic publications. This selection keeps the library representative of the field and enables benchmarking across diverse data scenarios.
  3. Item Indexing Methods: OpenP5 provides three distinct item indexing methods: random indexing, sequential indexing, and collaborative indexing. Each method offers a different way of identifying and representing items as tokens, which is pivotal in enabling LLMs to perform recommendation tasks within a language-processing framework (a minimal sketch follows this list).
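
To make the indexing dimension concrete, the sketch below illustrates the intuition behind the three schemes. It is not the OpenP5 implementation: the function names are hypothetical, and the collaborative variant uses a crude co-occurrence grouping as a stand-in for the spectral-clustering approach described in the paper.

```python
# Hypothetical illustration of the three item indexing ideas; assumes user
# interaction histories are given as lists of raw item identifiers.
import random
from collections import defaultdict

def random_indexing(items):
    """Assign each raw item an arbitrary integer ID (no structure preserved)."""
    ids = list(range(1, len(items) + 1))
    random.shuffle(ids)
    return dict(zip(items, ids))

def sequential_indexing(user_histories):
    """Assign IDs in order of first appearance, so items that occur close
    together in interaction sequences receive nearby numbers."""
    index, next_id = {}, 1
    for history in user_histories:
        for item in history:
            if item not in index:
                index[item] = next_id
                next_id += 1
    return index

def collaborative_indexing(user_histories, num_groups=2):
    """Crude stand-in: group items by co-interaction frequency so that
    frequently co-occurring items share a token prefix. The paper instead
    derives groups via spectral clustering of item co-occurrences."""
    co_counts = defaultdict(int)
    for history in user_histories:
        for item in history:
            co_counts[item] += len(history) - 1
    ranked = sorted(co_counts, key=co_counts.get, reverse=True)
    return {item: f"<g{rank % num_groups}><i{rank}>" for rank, item in enumerate(ranked)}

histories = [["shirt", "shoes", "hat"], ["shoes", "hat", "belt"]]
print(sequential_indexing(histories))  # {'shirt': 1, 'shoes': 2, 'hat': 3, 'belt': 4}
```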

Experimental Setup and Results

The authors systematically implement and evaluate the library across multiple experiments using these components. Notably, OpenP5 supports single-dataset models with corresponding checkpoints (P5) as well as a combined model, Super P5 (SP5), trained across all datasets for cross-domain recommendation. The paper discusses how OpenP5 uses natural language as the medium that unifies these recommendation tasks within a single model, as illustrated by the prompt sketch below.
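
The exact prompt templates live in the OpenP5 repository; the following is an illustrative sketch, with hypothetical wording and ID formats, of how both tasks can be verbalized as text for one generative model.

```python
# Hypothetical prompt templates; the real OpenP5 templates differ in wording
# and use the indexed item IDs produced by the chosen indexing method.
def sequential_prompt(user_id, history_ids):
    items = ", ".join(f"item_{i}" for i in history_ids)
    return (f"User user_{user_id} has interacted with {items}. "
            "Predict the next item this user will interact with.")

def straightforward_prompt(user_id):
    return f"Recommend an item for user_{user_id}."

print(sequential_prompt(42, [101, 205, 319]))
print(straightforward_prompt(42))
# Both strings are fed to the same encoder-decoder (e.g., T5) or decoder-only
# (e.g., Llama-2) model, which generates the target item ID as text.
```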

The evaluation maps these dimensions thoroughly through experiments against influential baseline models from the recommendation literature. Numerical results show OpenP5 performing strongly in most cases on both sequential and straightforward recommendation tasks, with the collaborative indexing method, which injects collaborative information into item IDs, yielding notably strong results.
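
Work in this line typically reports top-K ranking metrics such as Hit Ratio (HR@K) and NDCG@K. As a minimal sketch of that standard evaluation setup (not the OpenP5 evaluation code), assuming each test case has a single ground-truth item and a ranked candidate list:

```python
import math

def hit_ratio_at_k(ranked_items, target, k):
    """1 if the ground-truth item appears in the top-k list, else 0."""
    return int(target in ranked_items[:k])

def ndcg_at_k(ranked_items, target, k):
    """Discounted gain of the ground-truth item's rank; with a single
    relevant item the ideal DCG is 1, so no extra normalization is needed."""
    if target in ranked_items[:k]:
        rank = ranked_items.index(target)      # 0-based position in the ranking
        return 1.0 / math.log2(rank + 2)
    return 0.0

ranked = ["item_7", "item_3", "item_9", "item_1"]
print(hit_ratio_at_k(ranked, "item_9", 3))        # 1
print(round(ndcg_at_k(ranked, "item_9", 3), 3))   # 0.5
```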

Implications and Future Directions

OpenP5 tackles a crucial challenge within the recommendation field by providing a robust, open-source benchmark that can catalyze future research. Its multi-dimensional benchmarking options will help practitioners and researchers identify the strengths and weaknesses of foundation models for recommendation.

This paper opens avenues for richer exploration in recommendation systems. Future work could add further item indexing methods, support additional LLMs such as OPT or LLaMA, or expand into other data modalities. With the flexibility to integrate a broader range of LLMs, the OpenP5 library sets the stage for ongoing advancements in AI-driven recommendation systems, potentially improving their efficacy and scalability.

In conclusion, the OpenP5 library is a significant step towards establishing a consistent benchmark for foundation models within recommendation systems, bridging a critical gap in model assessment and setting a foundation for further research in this domain. Its integration of varied tasks, datasets, and indexing methods strengthens its capacity to drive innovation and a more nuanced understanding of generative recommendation systems.

Authors (3)
  1. Shuyuan Xu
  2. Wenyue Hua
  3. Yongfeng Zhang