Understanding Human Interpretability of Machine Learning Explanations
This paper investigates the human interpretability of machine learning explanations, focusing specifically on verification: checking whether a prediction follows from the explanation given for it. Through a series of user studies, the authors identify which components of explanation complexity most strongly affect the time humans need to verify a machine learning prediction's rationale.
The motivation behind interpretable machine learning is rooted in fostering trust and ensuring the safety of outputs by providing justifiable explanations for predictions. Despite the many explanation methods proposed, there is still no systematic understanding of how different explanation types perform, or where they break down, when humans must reason over them.
Methodological Approach
The research prioritizes intrinsic verification tasks over extrinsic goal-oriented tasks, in part to reduce variance stemming from participants' differing familiarity with the task environment. Decision sets are used as the explanation form because humans can process them readily for specific instances, a choice consistent with prior findings on the interpretability of rule sets.
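To make the verification setting concrete, the sketch below shows one way a decision-set explanation might be represented and checked against a claimed prediction. The rules, feature names, and labels here are hypothetical illustrations, not the paper's study materials.

```python
# Hypothetical decision-set explanation for a recipe-recommendation model.
# Each rule is (conditions, predicted_label); a condition is (feature, required_value).
decision_set = [
    ([("contains_chocolate", True), ("prep_under_30_min", True)], "recommend"),
    ([("is_vegetarian", True), ("contains_peanuts", False)], "recommend"),
    ([("spice_level_high", True)], "do_not_recommend"),
]

def verify(instance, claimed_label, rules):
    """Check whether the claimed prediction follows from the first rule whose
    conditions all hold for the instance (a simple first-match reading)."""
    for conditions, label in rules:
        if all(instance.get(feature) == value for feature, value in conditions):
            return label == claimed_label
    return False  # no rule fires, so the claim cannot be verified from the explanation

instance = {"contains_chocolate": True, "prep_under_30_min": True,
            "is_vegetarian": False, "contains_peanuts": False, "spice_level_high": False}
print(verify(instance, "recommend", decision_set))  # -> True
```

In the studies, participants perform essentially this check by hand, which is why the structure of the rules themselves becomes the object of measurement.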
A pivotal focus is on the decision set characteristics that affect human processing speed and accuracy, including the following factors (a sketch quantifying them follows the list):
- Explanation Size: Variations in the number of lines and complexity within each line (number of terms).
- Cognitive Chunks: The introduction and explicit representation of new concepts versus implicit integration.
- Variable Repetitions: The frequency with which terms are reused within an explanation.
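A minimal sketch, assuming the hypothetical rule representation above, of how these three factors could be quantified as crude complexity measures; the metric names are shorthand of mine, not the paper's.

```python
from collections import Counter

def complexity_metrics(rules, explicit_chunks=()):
    """Rough complexity measures for a decision set: number of lines
    (rules plus any explicitly defined concepts), terms per line,
    and how often individual features are repeated across rules."""
    terms_per_rule = [len(conditions) for conditions, _ in rules]
    feature_counts = Counter(f for conditions, _ in rules for f, _ in conditions)
    return {
        "num_lines": len(rules) + len(explicit_chunks),
        "max_terms_per_line": max(terms_per_rule),
        "repeated_features": {f: c for f, c in feature_counts.items() if c > 1},
    }

# Same hypothetical format as before: (conditions, label) with (feature, value) conditions.
rules = [
    ([("is_vegetarian", True), ("contains_peanuts", False)], "recommend"),
    ([("is_vegetarian", True), ("spice_level_high", True)], "do_not_recommend"),
]
print(complexity_metrics(rules, explicit_chunks=["low_allergen := not contains_peanuts"]))
# -> {'num_lines': 3, 'max_terms_per_line': 2, 'repeated_features': {'is_vegetarian': 2}}
```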
Two distinct domains, recipe recommendations and clinical decision support, are used to examine whether the change in context influences interpretability when the explanation structure is held identical across domains.
Results and Insights
The paper's primary results indicate:
- A direct relationship between increased explanation complexity and both higher response times and lower subjective satisfaction. The relevant complexity factors include the number of lines and the repetition of terms.
- Counterintuitively, making new cognitive chunks explicit, which in theory should aid understanding, increased response times and dissatisfaction, possibly because participants had to search through a larger number of explanation lines.
- Variations in explanation complexity exhibited negligible effects on accuracy, suggesting complexity primarily influences required cognitive effort rather than correctness in verification.
Implications and Future Directions
From a practical perspective, these findings suggest that interpretability-focused designs should be cautious about complexity, particularly decision set size and the explicit introduction of new concepts. Theoretically, the results underscore the need to account for human cognitive load when developing interpretable models.
Future work could investigate interpretability principles that generalize across explanation formats and tasks, and assess how people prefer new concepts to be defined as model dimensionality grows. Distinguishing tasks that push cognition toward detailed scrutiny from those that favor rapid decision-making may also refine how machine explanations are designed to fit human reasoning.
This paper takes a foundational step toward quantitatively mapping the intricacies of explanation systems in machine learning, while advocating further empirical evaluation aimed at refining explanatory communication between machines and humans.