Understanding Human Interpretability of Machine Learning Explanations
This paper investigates the human interpretability of machine learning explanations, focusing specifically on verification: checking whether a prediction follows from the explanation given for it. Through a series of user studies, the authors identify which components of explanation complexity most strongly affect the time humans need to verify a machine learning prediction's rationale.
The motivation behind interpretable machine learning is rooted in fostering trust and ensuring the safety of outputs by providing justifiable explanations for predictions. Despite the many explanation methods proposed, there is still no systematic understanding of how different explanation types perform, or where they break down, when humans must reason over them.
Methodological Approach
The research prioritizes intrinsic verification tasks over extrinsic goal-oriented tasks, in part to reduce variance stemming from participants' differing familiarity with the task environment. Decision sets are used as the explanation form because humans can process them readily for specific instances, a choice consistent with prior findings on the interpretability of rule sets.
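To make the verification setting concrete, the sketch below shows one way a decision-set explanation might be represented and checked against a claimed prediction. The rules, feature names, and labels here are hypothetical illustrations, not the paper's study materials.

```python
# Hypothetical decision-set explanation for a recipe-recommendation model.
# Each rule is (conditions, predicted_label); a condition is (feature, required_value).
decision_set = [
    ([("contains_chocolate", True), ("prep_under_30_min", True)], "recommend"),
    ([("is_vegetarian", True), ("contains_peanuts", False)], "recommend"),
    ([("spice_level_high", True)], "do_not_recommend"),
]

def verify(instance, claimed_label, rules):
    """Check whether the claimed prediction follows from the first rule whose
    conditions all hold for the instance (a simple first-match reading)."""
    for conditions, label in rules:
        if all(instance.get(feature) == value for feature, value in conditions):
            return label == claimed_label
    return False  # no rule fires, so the claim cannot be verified from the explanation

instance = {"contains_chocolate": True, "prep_under_30_min": True,
            "is_vegetarian": False, "contains_peanuts": False, "spice_level_high": False}
print(verify(instance, "recommend", decision_set))  # -> True
```

In the studies, participants perform essentially this check by hand, which is why the structure of the rules themselves becomes the object of measurement.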
A pivotal focus is on the decision set characteristics that affect human processing speed and accuracy, including the following factors (a sketch quantifying them follows the list):
- Explanation Size: Variations in the number of lines and complexity within each line (number of terms).
- Cognitive Chunks: The introduction and explicit representation of new concepts versus implicit integration.
- Variable Repetitions: The frequency with which terms are reused within an explanation.
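A minimal sketch, assuming the hypothetical rule representation above, of how these three factors could be quantified as crude complexity measures; the metric names are shorthand of mine, not the paper's.

```python
from collections import Counter

def complexity_metrics(rules, explicit_chunks=()):
    """Rough complexity measures for a decision set: number of lines
    (rules plus any explicitly defined concepts), terms per line,
    and how often individual features are repeated across rules."""
    terms_per_rule = [len(conditions) for conditions, _ in rules]
    feature_counts = Counter(f for conditions, _ in rules for f, _ in conditions)
    return {
        "num_lines": len(rules) + len(explicit_chunks),
        "max_terms_per_line": max(terms_per_rule),
        "repeated_features": {f: c for f, c in feature_counts.items() if c > 1},
    }

# Same hypothetical format as before: (conditions, label) with (feature, value) conditions.
rules = [
    ([("is_vegetarian", True), ("contains_peanuts", False)], "recommend"),
    ([("is_vegetarian", True), ("spice_level_high", True)], "do_not_recommend"),
]
print(complexity_metrics(rules, explicit_chunks=["low_allergen := not contains_peanuts"]))
# -> {'num_lines': 3, 'max_terms_per_line': 2, 'repeated_features': {'is_vegetarian': 2}}
```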
Two distinct domains, recipe recommendations and clinical decision support, are used to examine whether the change in context influences interpretability when the explanation structure is held identical across domains.
Results and Insights
The paper's primary results indicate:
- A direct relationship between increased explanation complexity and both higher response times and lower subjective satisfaction. The relevant complexity factors include the number of lines and the repetition of terms.
- Counterintuitively, making new cognitive chunks explicit, which in theory should aid understanding, increased response times and dissatisfaction, possibly because participants had to search through a larger number of explanation lines.
- Variations in explanation complexity exhibited negligible effects on accuracy, suggesting complexity primarily influences required cognitive effort rather than correctness in verification.
Implications and Future Directions
From a practical perspective, these findings suggest that interpretability-focused designs should be cautious about complexity, particularly decision set size and the explicit introduction of new concepts. Theoretically, the results underscore the need to account for human cognitive load when developing interpretable models.
Future work could investigate interpretability principles that generalize across explanation formats and tasks, and assess how people prefer new concepts to be defined as model dimensionality grows. Distinguishing tasks that push cognition toward detailed scrutiny from those that favor rapid decision-making may also refine how machine explanations are designed to fit human reasoning.
This paper takes a foundational step toward quantitatively mapping the intricacies of explanation systems in machine learning, while advocating further empirical evaluation aimed at refining explanatory communication between machines and humans.