- The paper synthesizes extensive literature on AI explanations, clarifying the distinction between explanation and justification.
- It details empirical evaluation techniques using metrics like mental model development and user performance to gauge explanation quality.
- The review advocates for dynamic, interactive explanatory models that foster trust and improve human-AI collaboration.
An Integrative Review of Explainable AI: Insights and Implications
The comprehensive review paper titled "Explanation in Human-AI Systems: A Literature Meta-Review," produced under the auspices of the DARPA XAI Program, delivers a detailed investigation of explainable artificial intelligence (XAI), addressing the central question of why and how AI systems should be made explainable. The review synthesizes a large body of studies and suggests methods to enhance the interpretability of AI systems, emphasizing its significance for both experts and end users.
Primary Objectives and Scope
The paper deliberates on what constitutes a good explanation within AI systems, a topic of immense importance in contemporary AI research given growing concerns about the interpretability of complex models such as deep neural networks. The review outlines the historical evolution from early expert systems to modern machine learning frameworks, highlighting the increasing complexity, and consequent opacity, of current AI technologies. Its scope spans multiple disciplines, including computer science, philosophy, psychology, and human factors, providing a holistic view of how explanation requirements permeate these fields.
Distinction Between Explanation and Justification
Central to the review is a precise delineation between explanation and justification within AI contexts. While explanations detail the processes and mechanisms leading to an AI's decision, justifications typically provide the rationale behind the AI's adopted approach, essential for engendering user trust and understanding. The paper posits that effective explanations should ideally guide users to a better mental model of the AI system, enhancing their ability to predict system outputs under various conditions.
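The distinction can be made concrete in code. Below is a minimal sketch, with a hypothetical rule-based loan screener (the rule, threshold, and field names are illustrative assumptions, not from the paper), in which the explanation traces the mechanism that produced the decision while the justification states the rationale for trusting that mechanism:

```python
# Hypothetical rule-based classifier illustrating the paper's distinction:
# the explanation describes the process that led to the decision (which
# rule fired, on which inputs); the justification gives the rationale for
# why that rule is a reasonable basis for the decision.

def screen_loan(income: float, debt: float) -> dict:
    ratio = debt / income  # debt-to-income ratio
    if ratio > 0.4:
        decision = "deny"
        explanation = f"debt-to-income ratio {ratio:.2f} exceeded the 0.40 threshold"
        justification = "high debt-to-income ratios are associated with elevated default risk"
    else:
        decision = "approve"
        explanation = f"debt-to-income ratio {ratio:.2f} is within the 0.40 threshold"
        justification = "applicants below this threshold carry acceptable repayment risk"
    return {"decision": decision,
            "explanation": explanation,
            "justification": justification}

result = screen_loan(income=50_000, debt=30_000)
```

Returning both fields side by side mirrors the review's point: a user who sees only the justification may trust the system, but only the explanation helps them build a mental model that predicts its behavior on new inputs.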
Empirical Evaluation and Practical Implications
The review strongly advocates for the empirical assessment of AI explanations, suggesting frameworks analogous to those employed in experimental psychology. Such evaluations should involve metrics related to explanation "goodness," user performance, mental model development, and trust—factors crucial for the successful integration of AI into decision-making processes. By adopting these metrics, AI systems can achieve a balance between technical accuracy and user comprehension, ultimately fostering greater user confidence and appropriate reliance on AI systems.
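As a sketch of what such an empirical evaluation might look like in practice (the metric names, study design, and trial data below are illustrative assumptions, not taken from the review), one could score a user study along three of the dimensions the review highlights: task performance, mental-model accuracy, and self-reported trust:

```python
# Illustrative scoring of a hypothetical user study along three evaluation
# dimensions discussed in the review: user performance, mental-model
# accuracy (did the user correctly predict the AI's output?), and trust.
from statistics import mean

def evaluate_study(trials: list[dict]) -> dict:
    return {
        # fraction of tasks the user completed correctly with the AI's help
        "user_performance": mean(t["task_correct"] for t in trials),
        # fraction of trials where the user correctly predicted the AI's output,
        # a common proxy for mental-model quality
        "mental_model_accuracy": mean(t["predicted_ai_output"] for t in trials),
        # mean self-reported trust on a 1-7 Likert scale
        "mean_trust": mean(t["trust_rating"] for t in trials),
    }

# fabricated example data for illustration only
trials = [
    {"task_correct": 1, "predicted_ai_output": 1, "trust_rating": 6},
    {"task_correct": 1, "predicted_ai_output": 0, "trust_rating": 5},
    {"task_correct": 0, "predicted_ai_output": 1, "trust_rating": 4},
    {"task_correct": 1, "predicted_ai_output": 1, "trust_rating": 6},
]
scores = evaluate_study(trials)
```

The point of such a scaffold is that explanation quality is measured behaviorally (can users predict and use the system?) rather than only by asking whether an explanation "seems good."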
Future Directions
Looking forward, the review underscores the necessity of developing both global and local explanations, clearly distinguishing accounts of overall system behavior from explanations of specific individual decisions. It recommends a focus on designing AI systems that facilitate interactive and exploratory user engagement, nurturing a co-adaptive human-machine relationship. Furthermore, because modern AI systems continue to learn over time, emphasis is placed on the need for dynamic explanatory models that evolve with the AI's operational proficiency.
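The global/local distinction can be sketched with a toy linear model (the features and weights below are illustrative assumptions): the global explanation describes which features the model weights most heavily overall, while the local explanation attributes one specific prediction to per-feature contributions:

```python
# Global vs. local explanation on a toy linear model (weights are
# illustrative). Globally, importance is the magnitude of each weight
# across all inputs; locally, a single prediction is decomposed into
# weight * feature_value contributions for one instance.

WEIGHTS = {"income": 0.8, "debt": -0.5, "age": 0.1}

def global_explanation() -> dict:
    # overall feature importance: absolute weight magnitude
    return {feature: abs(w) for feature, w in WEIGHTS.items()}

def local_explanation(instance: dict) -> dict:
    # per-feature contribution to this one instance's prediction
    return {feature: WEIGHTS[feature] * instance[feature] for feature in WEIGHTS}

instance = {"income": 2.0, "debt": 1.0, "age": 0.5}
contributions = local_explanation(instance)
score = sum(contributions.values())
```

For nonlinear models the same two questions are answered by more elaborate machinery (e.g., global feature-importance summaries versus per-instance attribution methods), but the conceptual split the review draws is the same.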
Conclusion
The reviewed paper presents a thorough investigation into the facets of explainable AI, grounded in rigorous analysis and broad cross-disciplinary insights. It clarifies the paths through which AI systems can be made more transparent, accountable, and aligned with human cognitive frameworks. As AI technologies continue to integrate into various domains, fostering mutual understanding between AI systems and humans remains critical. The findings and recommendations provided by this review offer significant guidance for future research and development aimed at achieving effective XAI systems. Such endeavors will be instrumental in bridging the gap between AI capabilities and user expectations, ensuring ethical and responsible AI deployment.