
The Road to Explainability is Paved with Bias: Measuring the Fairness of Explanations (2205.03295v2)

Published 6 May 2022 in cs.LG, cs.AI, and cs.CY

Abstract: Machine learning models in safety-critical settings like healthcare are often blackboxes: they contain a large number of parameters which are not transparent to users. Post-hoc explainability methods where a simple, human-interpretable model imitates the behavior of these blackbox models are often proposed to help users trust model predictions. In this work, we audit the quality of such explanations for different protected subgroups using real data from four settings in finance, healthcare, college admissions, and the US justice system. Across two different blackbox model architectures and four popular explainability methods, we find that the approximation quality of explanation models, also known as the fidelity, differs significantly between subgroups. We also demonstrate that pairing explainability methods with recent advances in robust machine learning can improve explanation fairness in some settings. However, we highlight the importance of communicating details of non-zero fidelity gaps to users, since a single solution might not exist across all settings. Finally, we discuss the implications of unfair explanation models as a challenging and understudied problem facing the machine learning community.
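To make the core metric concrete: fidelity here is the rate at which the explanation model's predictions agree with the blackbox's, computed per protected subgroup; the fidelity gap is the difference between the best- and worst-served groups. The following is a minimal illustrative sketch, not the authors' code; the model choices, synthetic data, and group indicator are all assumptions for demonstration.

```python
# Illustrative sketch (not the paper's implementation) of per-subgroup
# fidelity for a global surrogate explanation model.
import numpy as np
from sklearn.ensemble import RandomForestClassifier  # stand-in blackbox
from sklearn.tree import DecisionTreeClassifier      # interpretable surrogate

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 10))                # synthetic features
y = (X[:, 0] + X[:, 1] > 0).astype(int)        # synthetic labels
group = (rng.random(1000) < 0.3).astype(int)   # hypothetical subgroup flag

blackbox = RandomForestClassifier(random_state=0).fit(X, y)
yhat_bb = blackbox.predict(X)

# The surrogate is trained to imitate the blackbox's predictions.
surrogate = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, yhat_bb)
yhat_sur = surrogate.predict(X)

# Fidelity per subgroup: fraction of inputs where surrogate == blackbox.
fidelity = {g: float(np.mean(yhat_sur[group == g] == yhat_bb[group == g]))
            for g in np.unique(group)}
gap = max(fidelity.values()) - min(fidelity.values())  # fidelity gap
print(fidelity, gap)
```

A nonzero gap in this sense means one subgroup receives systematically less faithful explanations, which is the disparity the paper audits across finance, healthcare, college admissions, and criminal justice datasets.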

Authors (6)
  1. Aparna Balagopalan (17 papers)
  2. Haoran Zhang (102 papers)
  3. Kimia Hamidieh (7 papers)
  4. Thomas Hartvigsen (46 papers)
  5. Frank Rudzicz (90 papers)
  6. Marzyeh Ghassemi (96 papers)
Citations (67)
