Reliable explainability for current AI models
Develop robust, reliable methods to explain the decisions of current artificial intelligence systems, especially deep learning models and large language models, going beyond existing post-hoc interpretability techniques, whose outputs can be inconsistent or misleading.
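As a concrete illustration of the inconsistency mentioned above, the following minimal sketch (a hypothetical toy model, not drawn from the cited paper) applies two standard post-hoc attribution methods, plain gradient saliency and gradient-times-input, to the same prediction of the same network. The feature rankings they produce need not agree, which is exactly the kind of unreliability the problem statement targets.

```python
# Minimal sketch: two common post-hoc attribution methods can rank
# input features differently for the same model and the same input.
# The model and data here are hypothetical toys, chosen only for illustration.
import torch
import torch.nn as nn

torch.manual_seed(0)

# A tiny feed-forward "black box" with 4 input features.
model = nn.Sequential(nn.Linear(4, 8), nn.Tanh(), nn.Linear(8, 1))
x = torch.randn(1, 4, requires_grad=True)

# Gradient of the model's score with respect to the input.
score = model(x).sum()
(grad,) = torch.autograd.grad(score, x)

# Method 1: plain gradient saliency (absolute gradient magnitude).
saliency = grad.abs()
# Method 2: gradient * input, another widely used attribution rule.
grad_x_input = (grad * x.detach()).abs()

print("feature ranking, saliency:    ", saliency.argsort(descending=True))
print("feature ranking, grad * input:", grad_x_input.argsort(descending=True))
# The two rankings frequently disagree even on this toy model, so neither
# can be taken as "the" explanation without further validation.
```

Both methods are standard and cheap to compute, yet they answer subtly different questions about the model; the open problem is to develop explanation methods whose answers are consistent and faithful by construction rather than by coincidence.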
References
In practice, this research confirms that we do not know how to properly explain the decisions of current AIs.
— The Impact of Artificial Intelligence on Human Thought
(arXiv:2508.16628, Gesnot, 15 Aug 2025), Chapter 6 ("Black Box" AI and the Hypothesis of an Orchestrating Consciousness), section "Explainability and Trust in AI"