Enhancing Recommendation Diversity by Re-ranking with Large Language Models (2401.11506v2)

Published 21 Jan 2024 in cs.IR and cs.LG

Abstract: It has long been recognized that it is not enough for a Recommender System (RS) to provide recommendations based only on their relevance to users. Among many other criteria, the set of recommendations may need to be diverse. Diversity is one way of handling recommendation uncertainty and of ensuring that recommendations offer users a meaningful choice. The literature reports many ways of measuring diversity and of improving the diversity of a set of recommendations, most notably by re-ranking and selecting from a larger set of candidate recommendations. Driven by promising insights from the literature on how to incorporate versatile LLMs into the RS pipeline, in this paper we show how LLMs can be used for diversity re-ranking. We begin with an informal study that verifies that LLMs can be used for re-ranking tasks and do have some understanding of the concept of item diversity. Then, we design a more rigorous methodology in which LLMs are prompted to generate a diverse ranking from a candidate ranking, using various prompt templates with different re-ranking instructions in a zero-shot fashion. We conduct comprehensive experiments testing state-of-the-art LLMs from the GPT and Llama families. We compare their re-ranking capabilities with random re-ranking and with various traditional re-ranking methods from the literature. We open-source the code of our experiments for reproducibility. Our findings suggest that the trade-offs (in terms of performance and costs, among others) of LLM-based re-rankers are superior to those of random re-rankers but, as yet, inferior to those of traditional re-rankers. However, the LLM approach is promising. LLMs exhibit improved performance on many natural language processing and recommendation tasks and lower inference costs. Given these trends, we can expect LLM-based re-ranking to become more competitive soon.
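To make concrete what the "traditional re-ranking methods" that the paper uses as baselines look like, here is a minimal sketch of greedy MMR-style diversity re-ranking (Maximal Marginal Relevance): each step picks the item that best trades off relevance against similarity to items already selected. The item names, relevance scores, genre sets, and the Jaccard similarity are illustrative assumptions, not the paper's actual data or configuration.

```python
def mmr_rerank(candidates, relevance, similarity, k, lam=0.5):
    """Greedily select k items, trading off relevance against
    similarity to the items already selected (MMR-style)."""
    selected = []
    pool = list(candidates)
    while pool and len(selected) < k:
        def score(item):
            # Penalize items similar to anything already chosen.
            max_sim = max((similarity(item, s) for s in selected), default=0.0)
            return lam * relevance[item] - (1 - lam) * max_sim
        best = max(pool, key=score)
        selected.append(best)
        pool.remove(best)
    return selected

# Toy example: items described by genre sets; similarity = Jaccard overlap.
genres = {
    "a": {"action"},
    "b": {"action", "thriller"},
    "c": {"comedy"},
    "d": {"drama"},
}
rel = {"a": 0.9, "b": 0.85, "c": 0.6, "d": 0.5}

def jaccard(x, y):
    gx, gy = genres[x], genres[y]
    return len(gx & gy) / len(gx | gy)

top = mmr_rerank(["a", "b", "c", "d"], rel, jaccard, k=3)
print(top)  # ['a', 'c', 'd'] -- 'b' is demoted for overlapping with 'a'
```

A relevance-only ranking would return `['a', 'b', 'c']`; the diversity penalty pushes the genre-redundant item `b` below `c` and `d`. The paper's LLM-based approach replaces this explicit scoring step with a zero-shot prompt asking the model to produce a diverse re-ranking of the candidate list.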

Authors (2)
  1. Diego Carraro (2 papers)
  2. Derek Bridge (9 papers)
Citations (5)