The Challenge of Crafting Intelligible Intelligence (1803.04263v3)

Published 9 Mar 2018 in cs.AI

Abstract: Since AI software uses techniques like deep lookahead search and stochastic optimization of huge neural networks to fit mammoth datasets, it often results in complex behavior that is difficult for people to understand. Yet organizations are deploying AI algorithms in many mission-critical settings. To trust their behavior, we must make AI intelligible, either by using inherently interpretable models or by developing new methods for explaining and controlling otherwise overwhelmingly complex decisions using local approximation, vocabulary alignment, and interactive explanation. This paper argues that intelligibility is essential, surveys recent work on building such systems, and highlights key directions for research.

Citations (228)

Summary

  • The paper finds that inherently interpretable models like GA2M can achieve competitive accuracy while making AI decisions more understandable.
  • It examines post-hoc techniques such as LIME, which provide local, case-specific explanations for opaque deep neural networks.
  • The study advocates for interactive explanation systems that tailor AI explanations to boost trust and meet regulatory standards.

Intelligibility in Artificial Intelligence Systems: Current Approaches and Future Directions

The paper "The Challenge of Crafting Intelligible Intelligence" by Daniel S. Weld and Gagan Bansal addresses a critical issue in the deployment of AI systems: their intelligibility. As AI is increasingly employed in mission-critical environments, understanding its decision-making processes becomes crucial. The paper provides a comprehensive survey of methods to enhance AI intelligibility and discusses the balance between performance and interpretability.

Overview of Intelligibility Challenges

AI systems, specifically those leveraging deep lookahead search and neural networks, often produce complex, opaque behaviors. This lack of transparency impedes the trust required for deployments in sensitive areas such as credit scoring, self-driving cars, and criminal justice. The paper argues that intelligibility is key for trust, safety, and legal compliance, particularly in light of regulations like the European Union's GDPR, which emphasizes citizens' right to an explanation of automated decisions.

Approaches to AI Intelligibility

The authors categorize approaches to enhancing AI intelligibility into two primary strategies:

  1. Inherently Interpretable Models: These models are designed to be understandable by humans from the outset. They include linear models, decision trees, and, most notably, Generalized Additive Models (GAMs) and variants such as GA2M. The paper uses GA2M as an exemplar because it competes with complex models in accuracy while remaining interpretable: every univariate and pairwise term can be visualized on its own, which aids model debugging and makes each feature's contribution understandable (see the sketch after this list).
  2. Post-Hoc Explanations for Complex Models: For models that inherently lack transparency, such as deep neural networks, post-hoc explanation techniques are essential. Methods like LIME generate local approximations of complex models to provide understandable, case-specific explanations. This strategy requires developing a vocabulary for explanations, which must be semantically meaningful for human users.
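
A minimal sketch of the GA2M idea follows (illustrative names only, not the authors' implementation): a prediction is an intercept plus one learned shape function per feature plus a small set of pairwise interaction terms, and because every term depends on at most two features, each one can be plotted and inspected directly.

```python
import numpy as np

def ga2m_score(x, intercept, shape_fns, pair_fns):
    """Score a single example under a GA2M-style additive model.

    x         : 1-D feature vector
    shape_fns : {feature_index: callable} univariate shape functions f_i
    pair_fns  : {(i, j): callable} selected pairwise interaction terms f_ij
    """
    score = intercept
    for i, f in shape_fns.items():          # additive univariate terms f_i(x_i)
        score += f(x[i])
    for (i, j), f in pair_fns.items():      # selected pairwise terms f_ij(x_i, x_j)
        score += f(x[i], x[j])
    return score                            # apply a logistic link for classification

# Example with hand-written shape functions (in practice these are learned,
# e.g. as boosted, binned lookup tables):
score = ga2m_score(
    np.array([2.0, 0.5]),
    intercept=-1.0,
    shape_fns={0: lambda v: 0.3 * v, 1: lambda v: np.log1p(abs(v))},
    pair_fns={(0, 1): lambda a, b: 0.1 * a * b},
)
```

Because each term is one- or two-dimensional, a practitioner can plot every shape function and interaction heat map, which is what enables the model debugging described above.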

Numerical Results and Prominent Challenges

The survey underscores that methods like GA2M can achieve performance competitive with more intricate models while remaining interpretable. However, the space of pairwise interactions grows quadratically with the number of features, which presents a limitation in big-data contexts.

One prominent challenge in explaining complex models is balancing the trade-off between comprehensibility and fidelity. Algorithms such as LIME, as well as joint-training approaches that pair the predictor with a language model to generate explanations, attempt to produce explanations that are faithful to the model's decisions yet still understandable to users.
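
To make this trade-off concrete, the sketch below shows a LIME-style local surrogate for tabular data (a simplified illustration with assumed helper names, not the official lime package): the black-box model is queried on perturbations of a single instance, the perturbations are weighted by proximity, and a weighted linear model fit to those responses supplies the local explanation.

```python
import numpy as np
from sklearn.linear_model import Ridge

def explain_locally(black_box_predict, x, n_samples=1000, noise=0.5, kernel_width=1.0):
    """Return per-feature coefficients of a local linear surrogate around x.

    black_box_predict : callable mapping an (n, d) array to scores/probabilities
    x                 : 1-D feature vector (the instance to explain)
    """
    rng = np.random.default_rng(0)
    # 1. Perturb the instance with Gaussian noise.
    Z = x + rng.normal(0.0, noise, size=(n_samples, x.shape[0]))
    # 2. Query the opaque model on the perturbations.
    y = black_box_predict(Z)
    # 3. Weight perturbations by an exponential kernel on distance to x.
    d = np.linalg.norm(Z - x, axis=1)
    w = np.exp(-(d ** 2) / (kernel_width ** 2))
    # 4. Fit a weighted linear surrogate; its coefficients are the explanation.
    surrogate = Ridge(alpha=1.0).fit(Z, y, sample_weight=w)
    return surrogate.coef_
```

Sparser surrogates (for example, a Lasso with few nonzero coefficients) are easier to read but track the black-box model less closely, which is the comprehensibility-versus-fidelity tension in miniature.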

Future Prospects in Intelligibility Research

The paper suggests that the development of interactive explanation systems could address many current limitations. By facilitating a dialog between the AI and the user, these systems could tailor explanations to users’ needs and allow for counterfactual reasoning. Additionally, methods for mapping user feedback from explanatory models back to the original models are crucial, suggesting an ongoing synergy between human-computer interaction research and AI development.
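
As a concrete example of the counterfactual queries such a dialog could support ("what is the smallest change that would flip this decision?"), here is a hedged greedy sketch; the strategy and names are illustrative assumptions, not a method proposed in the paper.

```python
import numpy as np

def greedy_counterfactual(predict_proba, x, step=0.1, max_iters=100):
    """Search for a small perturbation of x that flips the model's decision.

    predict_proba : callable returning P(positive class) for a 1-D feature vector
    """
    original_label = predict_proba(x) >= 0.5
    cf = x.astype(float).copy()
    for _ in range(max_iters):
        p = predict_proba(cf)
        if (p >= 0.5) != original_label:
            return cf                                  # decision flipped: counterfactual found
        best_candidate, best_gain = None, None
        for i in range(len(cf)):
            for delta in (-step, step):
                candidate = cf.copy()
                candidate[i] += delta
                q = predict_proba(candidate)
                # Prefer the single-feature change that pushes the score
                # furthest toward the opposite decision.
                gain = (p - q) if original_label else (q - p)
                if best_gain is None or gain > best_gain:
                    best_candidate, best_gain = candidate, gain
        cf = best_candidate
    return None                                        # no flip found within the search budget
```

An interactive explainer would additionally restrict the search to features the user can actually change and present the resulting difference in the user's own vocabulary.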

Implications and Conclusion

The implications of research into AI intelligibility are profound, not only for enhancing user trust and safety but also for improving the AI's utility to human collaborators. Given the interdisciplinary expertise required to communicate AI processes to humans effectively, the paper calls for collaboration across AI, machine learning, HCI, philosophy, and psychology.

In conclusion, while methodologies for making AI systems intelligible are evolving, significant challenges remain. Explanatory dialog systems and improved vocabulary selection are only initial steps toward fully trustworthy AI systems. The paper is an essential read for researchers in the field, offering both a review of current methods and a roadmap for future exploration in AI intelligibility.