Understanding Biases in ChatGPT-based Recommender Systems: Provider Fairness, Temporal Stability, and Recency (2401.10545v3)

Published 19 Jan 2024 in cs.IR and cs.LG

Abstract: This paper explores the biases in ChatGPT-based recommender systems, focusing on provider fairness (item-side fairness). Through extensive experiments and over a thousand API calls, we investigate the impact of prompt design strategies (including structure, system role, and intent) on evaluation metrics such as provider fairness, catalog coverage, temporal stability, and recency. The first experiment examines these strategies in classical top-K recommendations, while the second evaluates sequential in-context learning (ICL). In the first experiment, we assess seven distinct prompt scenarios on top-K recommendation accuracy and fairness. Accuracy-oriented prompts, like Simple and Chain-of-Thought (COT), outperform diversification prompts, which, despite enhancing temporal freshness, reduce accuracy by up to 50%. Embedding fairness into system roles, such as "act as a fair recommender," proved more effective than fairness directives within prompts. Diversification prompts led to recommending newer movies and offered a broader genre distribution than traditional collaborative filtering (CF) models. The second experiment explores sequential ICL, comparing zero-shot and few-shot ICL. Results indicate that including user demographic information in prompts affects model biases and stereotypes. However, ICL did not consistently improve item fairness and catalog coverage over zero-shot learning. Zero-shot learning achieved higher NDCG and coverage, while ICL-2 showed slight improvements in hit rate (HR) when age-group context was included. Our study provides insights into biases of RecLLMs, particularly in provider fairness and catalog coverage. By examining prompt design, learning strategies, and system roles, we highlight the potential and challenges of integrating LLMs into recommendation systems. Further details can be found at https://github.com/yasdel/Benchmark_RecLLM_Fairness.
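
The abstract names hit rate (HR), NDCG, and catalog coverage without defining them. The following is a minimal sketch of the common textbook definitions under binary relevance; it is not taken from the paper or its repository, and the function names and toy data are hypothetical. The paper's provider-fairness, temporal-stability, and recency measures are not reproduced here because the abstract does not specify them.

```python
import math


def hit_rate_at_k(recommended, relevant, k):
    """Return 1.0 if any of the top-k recommended items is relevant, else 0.0."""
    return 1.0 if any(item in relevant for item in recommended[:k]) else 0.0


def ndcg_at_k(recommended, relevant, k):
    """Binary-relevance NDCG@k: discounted gain divided by the ideal gain."""
    dcg = sum(
        1.0 / math.log2(rank + 2)  # rank is 0-based, so position 1 gets log2(2)
        for rank, item in enumerate(recommended[:k])
        if item in relevant
    )
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(rank + 2) for rank in range(ideal_hits))
    return dcg / idcg if idcg > 0 else 0.0


def catalog_coverage(all_recommendations, catalog_size):
    """Fraction of the catalog that appears in at least one recommendation list."""
    distinct_items = set()
    for recs in all_recommendations:
        distinct_items.update(recs)
    return len(distinct_items) / catalog_size


# Toy data (hypothetical): one user's ranked list and held-out relevant item.
ranked = ["a", "b", "c"]
held_out = {"b"}
hr = hit_rate_at_k(ranked, held_out, k=3)
ndcg = ndcg_at_k(ranked, held_out, k=3)
coverage = catalog_coverage([["a", "b"], ["b", "c"]], catalog_size=10)
```

In practice HR and NDCG would be averaged over all test users, while coverage is computed once over the union of every user's list.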

Authors (1)
  1. Yashar Deldjoo (46 papers)
Citations (21)
