Analysis of HIVE: Evaluating Human Interpretability of Visual Explanations
"HIVE: Evaluating the Human Interpretability of Visual Explanations" introduces a novel framework designed to methodologically assess the utility of visual interpretability methods as perceived by human users in AI-assisted decision-making scenarios. This paper addresses a significant gap in the interpretability domain by moving beyond automated metrics and focusing on the human-centered evaluation of visual explanations.
The authors begin by identifying a clear shortcoming in how interpretability methods are currently evaluated: automated metrics often fail to capture how humans actually interact with AI explanations, and therefore say little about the practical usefulness of those explanations in real-world decision-making. In response, HIVE provides a rigorous, falsifiable methodology built on human studies, which the authors apply to four interpretability methods: GradCAM, a post-hoc heatmap method, and BagNet, ProtoPNet, and ProtoTree, which are interpretable-by-design architectures.
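As a rough illustration of the kind of visual explanation participants are asked to judge, the following is a minimal Grad-CAM sketch. It is not code from the HIVE release; the ResNet-50 backbone and the choice of `layer4` as the target layer are assumptions made for the example.

```python
# Minimal Grad-CAM sketch for illustration; not code from the HIVE release.
# The ResNet-50 backbone and `layer4` as the target layer are assumptions.
import torch
import torch.nn.functional as F
from torchvision import models

model = models.resnet50(weights=models.ResNet50_Weights.DEFAULT).eval()

activations, gradients = {}, {}

def save_activation(module, inputs, output):
    activations["feat"] = output.detach()

def save_gradient(module, grad_input, grad_output):
    gradients["feat"] = grad_output[0].detach()

model.layer4.register_forward_hook(save_activation)
model.layer4.register_full_backward_hook(save_gradient)

def grad_cam(image, class_idx=None):
    """Return a [0, 1] heatmap matching the spatial size of `image` (a 1x3xHxW tensor)."""
    logits = model(image)
    if class_idx is None:
        class_idx = logits.argmax(dim=1).item()
    model.zero_grad()
    logits[0, class_idx].backward()

    feats = activations["feat"]                      # 1 x C x h x w feature maps
    grads = gradients["feat"]                        # gradients of the class score w.r.t. them
    weights = grads.mean(dim=(2, 3), keepdim=True)   # channel weights: global-average-pooled grads
    cam = F.relu((weights * feats).sum(dim=1, keepdim=True))
    cam = F.interpolate(cam, size=image.shape[-2:], mode="bilinear", align_corners=False)
    return (cam - cam.min()) / (cam.max() - cam.min() + 1e-8)

# Usage: heatmap = grad_cam(preprocessed_image)  # overlay on the input image to get the explanation
```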
The evaluation rests on large-scale human studies involving nearly 1,000 participants. The results reveal a persistent tendency among users to trust AI explanations even when the underlying predictions are wrong, an observation that points to confirmation bias. Specifically, participants found 60% of explanations for incorrect predictions convincing, underlining the need for methods that actually help users distinguish correct from incorrect AI decisions.
Moreover, the paper finds that humans struggle to judge model correctness from explanations alone, reaching only 40% accuracy on tasks that required exactly this kind of discernment. This suggests that existing interpretability methods do not yet offer the clarity needed for reliable use in AI-supported decision-making. The empirical results emphasize that while explanations generally promote trust, that trust is unselective and extends even to incorrect predictions, with clear implications for how AI models might influence human decisions in applied settings.
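To make these two numbers concrete, the snippet below sketches how they might be computed from a table of study responses. The data layout, column names, and the 4-way chance baseline mentioned in the comments are assumptions for illustration, not HIVE's actual release format.

```python
# Hypothetical scoring of HIVE-style study responses.
# Column names and data layout are assumptions, not HIVE's release format.
import pandas as pd

# Agreement-style task: one row per judgment of a prediction plus its explanation.
agreement = pd.DataFrame({
    "model_correct":     [False, False, True, False, True],
    "judged_convincing": [True,  False, True, True,  True],
})

# Distinction-style task: one row per attempt to pick the model's correct prediction
# from explanations shown for several candidate classes.
distinction = pd.DataFrame({
    "chose_correctly": [True, False, False, True, False],
})

# How often are explanations for *incorrect* predictions still judged convincing?
wrong = agreement[~agreement["model_correct"]]
print(f"Convinced by incorrect predictions: {wrong['judged_convincing'].mean():.0%}")

# Accuracy at identifying the correct prediction from explanations alone,
# to be compared against the chance level of the multiple-choice format
# (25% if, as assumed here, four candidates are shown per question).
print(f"Distinction accuracy: {distinction['chose_correctly'].mean():.0%}")
```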
Additionally, the research underscores a notable gap between human and model similarity judgments, particularly for prototype-based models. This misalignment is a critical bottleneck for interpretability: the prototypes the model relies on do not match what humans perceive as similar or relevant, so explanations built on them carry less meaning than intended.
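One simple way to quantify this kind of misalignment is a rank correlation between human similarity ratings of (prototype, image patch) pairs and the model's own similarity scores for the same pairs. The sketch below uses made-up placeholder numbers, not data from the paper.

```python
# Hypothetical human-vs-model similarity alignment check for a prototype-based model.
# The ratings and scores are made-up placeholders, not data from the paper.
import numpy as np
from scipy.stats import spearmanr

# Human ratings (e.g., on a 1-5 scale) of how similar each image patch looks to a prototype.
human_ratings = np.array([5, 4, 2, 1, 3, 2, 4, 1])

# The model's similarity scores for the same (prototype, patch) pairs.
model_scores = np.array([0.91, 0.40, 0.85, 0.10, 0.55, 0.78, 0.62, 0.05])

rho, p_value = spearmanr(human_ratings, model_scores)
print(f"Spearman rho = {rho:.2f} (p = {p_value:.3f})")
# A low or negative rho reflects the misalignment described above: patches the model
# treats as highly similar to a prototype need not look similar to people.
```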
From a methodological standpoint, the work is notable for demonstrating an extensible framework that evaluates diverse interpretability methods through human-centric tasks. Future research could build on HIVE by bringing in domain-expert users in real-world contexts, moving beyond academic settings toward multi-disciplinary evaluation. The findings also raise a more fundamental question: how can AI models be designed to mitigate such biases by construction, so that their explanations are not just quantitatively but qualitatively aligned with human understanding, and not only in recognition tasks but across the spectrum of AI applications?
HIVE is released as open source, which should encourage broader adoption and iterative refinement of this evaluation methodology across the AI research community. As the field moves beyond raw performance metrics toward robust human interpretability, frameworks like HIVE will be pivotal in shaping how AI systems are developed and deployed.