
Detection of Groups with Biased Representation in Ranking (2301.00719v2)

Published 30 Dec 2022 in cs.LG and cs.DB

Abstract: Real-life tools for decision-making in many critical domains are based on ranking results. With the increasing awareness of algorithmic fairness, recent works have presented measures for fairness in ranking. Many of those definitions consider the representation of different "protected groups" in the top-$k$ ranked items, for any reasonable $k$. Given the protected groups, confirming algorithmic fairness is a simple task. However, the groups' definitions may be unknown in advance. In this paper, we study the problem of detecting groups with biased representation in the top-$k$ ranked items, eliminating the need to pre-define protected groups. The number of such possible groups can be exponential, making the problem hard. We propose efficient search algorithms for two different fairness measures: global representation bounds and proportional representation. We then propose a method to explain the bias in the representations of groups using the notion of Shapley values. We conclude with an experimental study showing the scalability of our approach and demonstrating the usefulness of the proposed algorithms.
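To make the problem statement concrete, the following is a minimal brute-force sketch of detection under a global lower bound. It is not the paper's search algorithm: it assumes a group is a conjunction of attribute=value conditions, and the function name `underrepresented_groups`, the parameters `lower_bound` and `min_size`, and the toy data are all hypothetical. It enumerates every candidate group, which is exactly the exponential blow-up the abstract describes.

```python
from itertools import combinations, product


def underrepresented_groups(ranked, attrs, k, lower_bound, min_size):
    """Brute-force illustration (assumed formulation, not the paper's method).

    ranked      -- list of dicts, already sorted from best to worst rank
    attrs       -- attribute names used to define candidate groups
    k           -- size of the top-k prefix to inspect
    lower_bound -- assumed minimum number of group members required in the top-k
    min_size    -- assumed minimum overall group size worth reporting
    """
    top_k = ranked[:k]
    flagged = []
    # Enumerate every non-empty subset of attributes ...
    for r in range(1, len(attrs) + 1):
        for subset in combinations(attrs, r):
            # ... and every combination of values observed for that subset.
            domains = [sorted({item[a] for item in ranked}) for a in subset]
            for values in product(*domains):
                pattern = dict(zip(subset, values))

                def matches(item, pattern=pattern):
                    return all(item[a] == v for a, v in pattern.items())

                size = sum(1 for item in ranked if matches(item))
                if size < min_size:
                    continue  # skip groups too small to be of interest
                in_top = sum(1 for item in top_k if matches(item))
                if in_top < lower_bound:
                    flagged.append((pattern, size, in_top))
    return flagged


# Toy ranking with two hypothetical attributes.
ranked = [
    {"gender": "M", "school": "A"},
    {"gender": "M", "school": "B"},
    {"gender": "M", "school": "A"},
    {"gender": "F", "school": "A"},
    {"gender": "F", "school": "B"},
    {"gender": "F", "school": "B"},
]
flagged = underrepresented_groups(
    ranked, ["gender", "school"], k=3, lower_bound=1, min_size=2
)
for pattern, size, in_top in flagged:
    print(pattern, "size:", size, "in top-k:", in_top)
```

On this toy input the sketch flags the groups gender=F and gender=F, school=B, neither of which has a member in the top 3. The number of candidate patterns grows exponentially with the number of attributes, so exhaustive enumeration like this is only feasible for very small inputs.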
