Dissenting Explanations: Leveraging Disagreement to Reduce Model Overreliance (2307.07636v3)
Abstract: While explainability is a desirable characteristic of increasingly complex black-box models, modern explanation methods have been shown to be inconsistent and contradictory. The semantics of explanations are not always fully understood: to what extent do explanations "explain" a decision, and to what extent do they merely advocate for one? Can we help humans gain insight from explanations that accompany correct predictions, without over-relying on incorrect predictions that explanations advocate for? With this perspective in mind, we introduce the notion of dissenting explanations: conflicting predictions with accompanying explanations. We first explore the advantage of dissenting explanations in the setting of model multiplicity, where multiple models with similar performance may produce different predictions. In such cases, dissenting explanations can be provided by invoking the explanations of disagreeing models. Through a pilot study, we demonstrate that dissenting explanations reduce overreliance on model predictions without reducing overall accuracy. Motivated by the utility of dissenting explanations, we present both global and local methods for their generation.
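The abstract describes dissenting explanations via model multiplicity: train several models of comparable accuracy, find instances where an alternative model disagrees with the primary model's prediction, and surface the disagreeing model's explanation alongside the original one. The snippet below is a minimal illustrative sketch of that idea, not the paper's actual pipeline; the synthetic dataset, the choice of scikit-learn models, the 0.05 accuracy-gap threshold, and the crude occlusion-style attribution (standing in for an off-the-shelf explainer such as LIME or SHAP) are all assumptions made for the example.

```python
"""Minimal sketch (not the authors' code): surface a dissenting explanation
by finding a similarly-accurate model that disagrees with the primary model."""
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Illustrative synthetic data and models (assumptions for this sketch).
X, y = make_classification(n_samples=2000, n_features=8, n_informative=5, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

primary = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
alternatives = [
    RandomForestClassifier(n_estimators=200, random_state=1).fit(X_tr, y_tr),
    GradientBoostingClassifier(random_state=2).fit(X_tr, y_tr),
]

def local_attribution(model, x, eps=0.5):
    """Crude occlusion-style local attribution: drop in predicted probability
    when each feature is perturbed. A stand-in for LIME/SHAP in this sketch."""
    base = model.predict_proba(x.reshape(1, -1))[0, 1]
    scores = np.zeros_like(x)
    for j in range(len(x)):
        x_pert = x.copy()
        x_pert[j] += eps
        scores[j] = base - model.predict_proba(x_pert.reshape(1, -1))[0, 1]
    return scores

acc_primary = accuracy_score(y_te, primary.predict(X_te))
for alt in alternatives:
    acc_alt = accuracy_score(y_te, alt.predict(X_te))
    if abs(acc_primary - acc_alt) > 0.05:
        continue  # only alternatives with comparable accuracy qualify
    disagree = np.where(primary.predict(X_te) != alt.predict(X_te))[0]
    if len(disagree) == 0:
        continue
    i = disagree[0]  # first instance where the two models conflict
    print(f"Instance {i}: primary={primary.predict(X_te[i:i+1])[0]}, "
          f"dissenting={alt.predict(X_te[i:i+1])[0]} "
          f"(acc {acc_primary:.3f} vs {acc_alt:.3f})")
    print("  primary attribution:   ", np.round(local_attribution(primary, X_te[i]), 3))
    print("  dissenting attribution:", np.round(local_attribution(alt, X_te[i]), 3))
    break
```

Showing the two attributions side by side is what the paper frames as a dissenting explanation: an explanation that advocates for the opposite prediction, intended to temper overreliance on the primary model.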
Authors: Omer Reingold, Judy Hanwen Shen, Aditi Talati