
Help or Hinder? Evaluating the Impact of Fairness Metrics and Algorithms in Visualizations for Consensus Ranking (2308.06233v1)

Published 11 Aug 2023 in cs.HC

Abstract: For applications where multiple stakeholders provide recommendations, a fair consensus ranking must not only ensure that the preferences of rankers are well represented, but must also mitigate disadvantages among socio-demographic groups in the final result. However, there is little empirical guidance on the value or challenges of visualizing and integrating fairness metrics and algorithms into human-in-the-loop systems to aid decision-makers. In this work, we design a study to analyze the effectiveness of integrating such fairness-metric-based visualizations and algorithms. We explore this through a task-based crowdsourced experiment comparing an interactive visualization system for constructing consensus rankings, ConsensusFuse, with a similar system that includes visual encodings of fairness metrics and fair-rank generation algorithms, FairFuse. We analyze the measure of fairness, the agreement of rankers' decisions, and user interactions in constructing the fair consensus ranking across these two systems. In our study with 200 participants, results suggest that providing these fairness-oriented support features nudges users to align their decisions with the fairness metrics while reducing the tedious process of manually amending the consensus ranking. We discuss the implications of these results for the design of next-generation fairness-oriented systems, along with emerging directions for future research.
