Overview of Local Rule-Based Explanations of Black Box Decision Systems
The paper "Local Rule-Based Explanations of Black Box Decision Systems" by Riccardo Guidotti et al. addresses the critical issue of the opacity of decision-making processes in black box algorithms, which complicates their adoption in sensitive or regulated environments. The authors propose LORE (LOcal Rule-based Explanations), a method to generate local interpretability for black box models by providing decision explanations and counterfactuals.
Methodology
LORE focuses on interpreting individual predictions rather than providing a global understanding of the decision system. It combines synthetic instance generation through a genetic algorithm with the learning of a decision tree on those instances, which serves as an interpretable local surrogate. The surrogate aims to closely mimic the behavior of the black box over a locally relevant region of the feature space.
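The overall pipeline can be illustrated with a minimal sketch, assuming a scikit-learn-style black box exposing a predict method; generate_neighborhood is a hypothetical placeholder for the genetic procedure described in the next section.

```python
# Minimal sketch of the local-surrogate idea behind LORE.
# Assumes a scikit-learn-style black box with .predict(); the neighborhood
# generator is a stand-in for the genetic procedure described below.
from sklearn.tree import DecisionTreeClassifier

def explain_instance(black_box, x, generate_neighborhood, n_samples=1000):
    """Fit an interpretable surrogate tree around a single instance x."""
    # 1. Build a synthetic neighborhood around x.
    Z = generate_neighborhood(x, n_samples)   # shape: (n_samples, n_features)
    # 2. Label the synthetic points with the black box, not the ground truth.
    y_bb = black_box.predict(Z)
    # 3. Train a decision tree that locally mimics the black box.
    surrogate = DecisionTreeClassifier()
    surrogate.fit(Z, y_bb)
    return surrogate
```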
Neighborhood Generation
The method begins by generating a balanced local neighborhood around the instance to be explained. A genetic algorithm creates synthetic instances that expose the local decision boundary of the black box, guided by two fitness functions: one rewards instances that the black box labels the same as the instance under analysis, the other rewards instances that receive a different label, while both favor points close to the original, as sketched below.
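A simplified sketch of the two fitness criteria, assuming a black box with a predict method and a distance function d normalized to [0, 1]. Roughly following the paper, each criterion combines a label-agreement (or disagreement) indicator, a proximity term, and a penalty for exact copies of the instance itself; the precise formulation differs.

```python
# Simplified sketch of the two fitness criteria steering the genetic search.
# Assumes black_box.predict() and a distance d(z, x) normalized to [0, 1].
import numpy as np

def fitness_same(z, x, black_box, d):
    """Higher for points labeled like x, close to x, but not identical to x."""
    agree = float(black_box.predict([z])[0] == black_box.predict([x])[0])
    is_copy = float(np.array_equal(z, x))
    return agree + (1.0 - d(z, x)) - is_copy

def fitness_diff(z, x, black_box, d):
    """Higher for points labeled differently from x, close to x, not identical."""
    disagree = float(black_box.predict([z])[0] != black_box.predict([x])[0])
    is_copy = float(np.array_equal(z, x))
    return disagree + (1.0 - d(z, x)) - is_copy
```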
Extraction of Explanations
Once the local neighborhood has been generated, a decision tree is trained on it. From this tree, LORE extracts both a decision rule and a set of counterfactual rules. The decision rule states the conditions that lead to the predicted class, while the counterfactual rules describe the minimal feature changes that would alter the decision, offering what-if scenarios.
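A minimal sketch of how the factual decision rule can be read off the surrogate tree's root-to-leaf path for the instance, using scikit-learn's tree internals. The function name and output format are illustrative; counterfactual rules would additionally require inspecting paths ending in leaves that predict a different class.

```python
# Sketch: turn the surrogate tree's root-to-leaf path for x into a rule.
def decision_rule(surrogate, x, feature_names):
    tree = surrogate.tree_
    node_ids = surrogate.decision_path(x.reshape(1, -1)).indices  # visited nodes
    premises = []
    for node_id in node_ids:
        # Leaf nodes carry no split condition.
        if tree.children_left[node_id] == tree.children_right[node_id]:
            continue
        feat = tree.feature[node_id]
        threshold = tree.threshold[node_id]
        if x[feat] <= threshold:
            premises.append(f"{feature_names[feat]} <= {threshold:.3f}")
        else:
            premises.append(f"{feature_names[feat]} > {threshold:.3f}")
    outcome = surrogate.predict(x.reshape(1, -1))[0]
    return premises, outcome
```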
Experimental Validation
The paper offers extensive experimental validation of LORE against other methods, including comparisons to LIME and Anchor. LORE was tested across datasets featuring mixed-type features (both categorical and continuous). By leveraging genetic algorithms, LORE maintains an effective balance between exploration and exploitation in instance generation, establishing dense and informative neighborhoods crucial for capturing local decision boundaries.
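Fidelity, the main quantitative criterion in such comparisons, measures how often the surrogate agrees with the black box on the synthetic neighborhood. A minimal sketch of this measure, assuming the surrogate and neighborhood from the earlier sketches; the paper also reports complementary measures on the instance itself.

```python
# Sketch of the fidelity measure: fraction of neighborhood points on which
# the surrogate reproduces the black box's prediction.
import numpy as np

def fidelity(surrogate, black_box, Z):
    return float(np.mean(surrogate.predict(Z) == black_box.predict(Z)))
```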
Key Results and Implications
The evaluation indicates that LORE outperforms LIME and similar techniques both in predictive fidelity to the black box and in the comprehensibility of its explanations. LORE's decision rules lend themselves more readily to human interpretation and do not require the user to fix the complexity of the explanation in advance (as LIME does with the number of features).
This research has significant implications for AI transparency, especially in ethically regulated sectors. By enhancing model interpretability, LORE contributes to bridging the gap between complex model outputs and human-understandable rationales. This can foster trust and facilitate the lawful deployment of AI systems where explainability is a regulatory requirement, such as under GDPR in Europe.
Future Directions
The paper suggests several pathways for further research. Extending LORE to domains such as images and text, integrating it with analyses of global model behavior, and conducting user studies to evaluate explanation comprehensibility are promising directions. Particularly intriguing is the potential to apply the framework to bias detection and remediation within machine learning pipelines, thereby promoting ethical AI usage.
Overall, the proposed method provides a robust framework for local explanations, representing a significant advancement in the quest for explainable AI, ultimately aligning computational efficiency with societal needs for transparency.