
Towards Measuring Fairness in Grid Layout in Recommender Systems

Published 19 Sep 2023 in cs.IR | (2309.10271v1)

Abstract: There has been significant research in the last five years on ensuring the providers of items in a recommender system are treated fairly, particularly in terms of the exposure the system provides to their work through its results. However, the metrics developed to date have all been designed and tested for linear ranked lists. It is unknown whether and how existing fair ranking metrics for linear layouts can be applied to grid-based displays. Moreover, depending on the device (phone, tablet, or laptop) users use to interact with systems, column size is adjusted using column reduction approaches in a grid-view. The visibility or exposure of recommended items in grid layouts varies based on column sizes and column reduction approaches as well. In this paper, we extend existing fair ranking concepts and metrics to study provider-side group fairness in grid layouts, present an analysis of the behavior of these grid adaptations of fair ranking metrics, and study how their behavior changes across different grid ranking layout designs and geometries. We examine how fairness scores change with different ranking layouts to yield insights into (1) the consistency of fair ranking measurements across layouts; (2) whether rankings optimized for fairness in a linear ranking remain fair when the results are displayed in a grid; and (3) the impact of column reduction approaches to support different device geometries on fairness measurement. This work highlights the need to use layout-specific user attention models when measuring fairness of rankings, and provides practitioners with a first set of insights on what to expect when translating existing fair ranking metrics to the grid layouts in wide use today.

Authors (2)

Summary

  • The paper introduces adapted fairness measures for grid layouts, moving beyond traditional linear ranking assessments.
  • It reveals that grid design and device-induced layout changes significantly influence recommendation exposure.
  • The study highlights the need for tailored user attention models in recommender systems to ensure equitable content visibility.

Introduction

Recommender systems, which power much of the content we engage with on digital platforms, from streaming services to e-commerce websites, exert significant influence over which items gain visibility and which languish in obscurity. At the heart of these systems lies an algorithm that selects and ranks content based on presumed user preference, simultaneously shaping the exposure that content creators receive. Amid growing scrutiny, ensuring that these recommendation algorithms distribute exposure fairly has become a pressing concern.

Evaluating Fairness in Grid Layouts

Traditionally, fairness in recommendations has been assessed through linear rankings, where items are listed vertically, as if on a search engine results page. However, in real-world applications, content is just as likely to be arranged in grid layouts, responding dynamically to different devices with varying screen sizes. Implicit in this dynamic presentation is an assumption that has largely gone untested—that fairness metrics developed for linear layouts will hold up when applied to grids.

The study discussed here tackles this challenge head-on, attempting to bridge this gap. It aims to adapt existing fairness concepts and metrics to grid layouts, examining the reliability of these adapted metrics across various layout designs. Specifically, it evaluates the visibility and subsequent fairness of item exposure as influenced by the user's device, be it a phone, tablet, or desktop, which determines the grid's column size and format.
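To make the geometry concrete, here is a minimal sketch of how a ranked list maps into grid cells on devices with different column counts. The fill order (left-to-right, top-to-bottom) and the column values are illustrative assumptions, not the paper's exact formulation:

```python
import numpy as np

def grid_positions(n, cols):
    """Map ranks 0..n-1 of a ranked list into (row, col) grid cells,
    filling left-to-right, top-to-bottom."""
    ranks = np.arange(n)
    return ranks // cols, ranks % cols

# The same ranking lands in different rows on different devices:
rows_phone, _ = grid_positions(12, 2)    # narrow phone grid
rows_desktop, _ = grid_positions(12, 4)  # wide desktop grid
```

Under any row-sensitive attention model, this shift matters: the item at rank 5 sits in row 2 on the two-column phone but row 1 on the four-column desktop, so the same ranking yields different exposure profiles per device.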

Fairness Measurement Insights

A cornerstone finding of this research is that fairness measurements can differ greatly depending on layout design and geometry. This matters because it suggests that an algorithm appearing to distribute exposure fairly in a linear arrangement may fail to do so in a grid format, or vice versa. The study delineates several types of grid layouts and introduces grid adaptations of existing user browsing models, such as Row-Skipping and Slower-Decay variants, to enable fairness assessment in grid-based recommenders.
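As a rough illustration of what row-sensitive attention models might look like, the sketch below implements hypothetical slower-decay and row-skipping weightings plus a simple group exposure gap. The functional forms and parameter values are assumptions for illustration, not the paper's definitions:

```python
import numpy as np

def slower_decay_weights(n, cols, gamma=0.85):
    """Attention decays per *row* rather than per item, so all items
    in a row share one weight (gamma is an illustrative value)."""
    rows = np.arange(n) // cols
    return gamma ** rows

def row_skipping_weights(n, cols, p_continue=0.8, p_skip=0.15):
    """Row-wise cascade: the user moves down one row with probability
    p_continue and skips any given row outright with probability
    p_skip (both values illustrative)."""
    rows = np.arange(n) // cols
    return (p_continue ** rows) * (1.0 - p_skip)

def group_exposure_gap(weights, in_group):
    """Difference in total attention between two provider groups;
    in_group is a boolean mask marking the protected group."""
    in_group = np.asarray(in_group, dtype=bool)
    return weights[in_group].sum() - weights[~in_group].sum()
```

With weightings like these, the same ranking scores differently as the column count changes, because items move between rows and the row, not the absolute rank, drives the weight.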

The inquiry extends to observing how variations in grid layouts due to changes in device screens impact these fairness evaluations. The analysis reveals that both the type of grid layout and the column adjustment approaches—for example, truncating off-screen items or re-wrapping them into new rows—affect the perceived fairness of recommendations.
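The two column-reduction strategies can be sketched as follows (hypothetical helper names; the paper's exact procedures may differ):

```python
def truncate(ranking, full_cols, device_cols):
    """Narrow the grid by dropping items that fall off-screen:
    keep only the first device_cols cells of each full-width row."""
    return [item for i, item in enumerate(ranking)
            if i % full_cols < device_cols]

def rewrap(ranking, device_cols):
    """Narrow the grid by re-flowing the whole ranked list into
    rows of device_cols, preserving rank order."""
    return [ranking[i:i + device_cols]
            for i in range(0, len(ranking), device_cols)]
```

The contrast is the point: truncation removes whole items from view (the last cell of every full-width row vanishes on a narrower screen), while re-wrapping keeps every item but pushes lower ranks into deeper rows. The two strategies therefore expose different item sets at different row depths, and a fairness metric sensitive to either will score them differently.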

Practical Implications for Providers

For practitioners, this research signals a crucial need to use layout-specific user attention models when evaluating the fairness of their recommender systems. It underscores that a one-size-fits-all fairness metric does not translate well across different geometries and adaptive layouts. Furthermore, the study shows that the different approaches to adjusting column counts for various devices carry their own implications for fairness measurement.

The implications of these findings are far-reaching. Content providers and platform designers must carefully consider how the display geometry interacts with the user browsing behavior, as this interplay significantly mediates the exposure an item gets. Establishing fair exposure becomes a multi-faceted problem where display layout, device type, and user interaction behavior all demand meticulous scrutiny.

Conclusion

In conclusion, this paper takes an important step towards understanding and measuring fairness in grid-layout recommender systems. It illuminates the complexities of evaluating fairness in non-linear display formats and calls for a tailored approach that accounts for the nuances of grid layouts and user attention patterns. Such insights are pivotal for developing fair recommender systems that serve both users and content providers equitably.
