
Explanatory View Hypothesis

Updated 6 November 2025
  • The Explanatory View Hypothesis is a framework that defines explanation as a recipient-centered process focused on updating the recipient's understanding through communicatively effective interaction.
  • It emphasizes evaluative criteria such as accuracy, simplicity, and causal adequacy, often formalized using Bayesian metrics to assess explanatory virtues.
  • The hypothesis finds practical application in explainable AI, scientific theory selection, and public engagement by tailoring explanations to specific epistemic needs.

The Explanatory View Hypothesis

The Explanatory View Hypothesis (EVH) is a foundational stance in philosophy of science, AI, cognitive science, and related disciplines, positing that the core function of explanation is to effect a meaningful update in an explainee’s (recipient’s) understanding, and that explanations should be judged, selected, and evaluated on the basis of their explanatory virtues and communicative effectiveness. This hypothesis entails a shift from purely system- or model-centered perspectives on explanation to explicitly recipient- or context-centric frameworks, emphasizing explanation as a communicative, pragmatic, and normatively evaluable act. The EVH has been formalized in diverse domains, with technical elaborations in AI, computational models of cognition, philosophy of science, and democratic theory.

1. Conceptual Foundations and Core Definitions

At its core, the Explanatory View Hypothesis asserts that the central normative standard for explanation is its success in producing understanding in the recipient. In the tradition of philosophy of science, this is often expressed in terms such as:

"A 'good' explanation is one that maximizes (or optimizes, to a sufficient degree) a set of explanatory virtues, given the context of inquiry" (Zukerman, 22 Nov 2024).

Within AI research, the EVH is operationalized through metrics evaluating the effect of an explanation on an agent's internal state. For example, (Cope et al., 2023) formalizes this via the explanatory effectiveness, defined as the change in the explainee's understanding of a phenomenon $p$ after explanatory interaction:

$$\text{Effectiveness}(\mathbf{o}_B, p) = U(B^\tau, p) - U(B^1, p)$$

where $U(X, p)$ denotes the explainee $X$'s measured understanding of $p$, and $\mathbf{o}_B$ encodes the explanatory interventions.
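
This definition lends itself to a direct computational reading. The sketch below is a minimal illustration, assuming a hypothetical probe-question score as the understanding measure $U$ rather than the algorithmic-information-theoretic measure used by Cope et al.; the question set and answers are invented for illustration.

```python
# Minimal sketch of explanatory effectiveness as a before/after difference in
# measured understanding. Here "understanding" is simply the explainee's score
# on a set of hypothetical probe questions about the phenomenon p, not the
# formal AIT measure from Cope et al. (2023).

def understanding(answers, answer_key):
    """Fraction of probe questions about p answered correctly."""
    correct = sum(answers.get(q) == a for q, a in answer_key.items())
    return correct / len(answer_key)

def effectiveness(answers_before, answers_after, answer_key):
    """Effectiveness(o_B, p) = U(B^tau, p) - U(B^1, p)."""
    return understanding(answers_after, answer_key) - understanding(answers_before, answer_key)

answer_key = {"q1": "A", "q2": "C", "q3": "B"}
before = {"q1": "A", "q2": "B", "q3": "D"}   # explainee state B^1 (pre-explanation)
after  = {"q1": "A", "q2": "C", "q3": "B"}   # explainee state B^tau (post-explanation)
print(effectiveness(before, after, answer_key))  # ~0.67: understanding improved
```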

Complementary definitions are articulated through frameworks such as model explainability in NLP (Deemter, 2022), causal model-based pragmatic accounts of explanation as communicative (per the Rational Speech Acts model) (Harding et al., 6 May 2025), and formal Bayesian decompositions of explanatory values (Wojtowicz et al., 2020).

2. Explanatory Virtues, Values, and Bayesian Formalization

Central to the EVH is the notion that explanations are to be evaluated by their explanatory virtues (EVs)—meta-theoretical properties or criteria such as accuracy, simplicity, unification, coherence, coverage, depth, and causal adequacy (Zukerman, 22 Nov 2024). These virtues are deeply embedded in both abductive inference (inference to the best explanation) and in the technical evaluation of AI systems (Deemter, 2022, Wojtowicz et al., 2020).

Recent technical work anchors these virtues in Bayesian formalism. As explicated in (Wojtowicz et al., 2020), explanatory values map onto distinct mathematical terms in the Bayesian posterior:

$$\log p(E \mid x) = \sum_{i=1}^n \log p(x_i \mid E) + \log\!\left(\frac{p(x \mid E)}{\prod_i p(x_i \mid E)}\right) + \sum_i \log T_i(E) + \log \pi(E)$$

where the terms correspond to empirically assessed values (descriptiveness, co-explanation), theoretical virtues (power, unification, simplicity), and priors. This decomposition aligns with psychological evidence about human explanatory preferences and supports the interpretation of anomalous inference patterns (e.g., conspiracy reasoning) as virtue misweighting.
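
As a concrete illustration, the sketch below evaluates this decomposition numerically for a single candidate explanation. All probabilities are invented placeholders, and the theoretical-virtue terms $T_i(E)$ are folded into the prior for brevity; it is not an implementation of the cited paper's full model.

```python
import math

# Numerical sketch of the Bayesian decomposition of explanatory value.
# The probabilities below are illustrative placeholders.

def explanatory_value(p_xi_given_E, p_x_given_E, prior_E):
    """Return the log posterior (up to the constant log p(x)) and its parts."""
    descriptiveness = sum(math.log(p) for p in p_xi_given_E)   # sum_i log p(x_i | E)
    co_explanation = math.log(p_x_given_E) - descriptiveness    # log [ p(x|E) / prod_i p(x_i|E) ]
    prior = math.log(prior_E)                                   # log pi(E)  (virtue terms folded in)
    total = descriptiveness + co_explanation + prior
    return total, {"descriptiveness": descriptiveness,
                   "co_explanation": co_explanation,
                   "prior": prior}

# Explanation E accounts for three observations x_1..x_3
total, parts = explanatory_value(
    p_xi_given_E=[0.8, 0.7, 0.9],  # per-datum likelihoods
    p_x_given_E=0.6,               # joint likelihood (> product => positive co-explanation)
    prior_E=0.2,
)
print(parts, total)
```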

In AI and statistical modeling, related metrics include the Kolmogorov complexity of explanations (parsimony) and the mutual information or causal relevance furnished by explanatory acts (Cope et al., 2023, Chajewska et al., 2013).
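
Since Kolmogorov complexity is uncomputable, practical systems typically substitute a computable proxy. The sketch below uses compressed string length as a crude stand-in for the description cost (parsimony) of an explanation; the example explanations are invented.

```python
import zlib

# Rough parsimony proxy: Kolmogorov complexity is uncomputable, so compressed
# length is sometimes used as a crude stand-in for description cost.

def parsimony_proxy(explanation: str) -> int:
    """Shorter compressed length ~ simpler explanation (lower description cost)."""
    return len(zlib.compress(explanation.encode("utf-8")))

concise = "The fuse blew, cutting power to the circuit."
baroque = ("A transient surge, possibly induced by a rare alignment of appliance "
           "duty cycles, may have interacted with latent wiring degradation to "
           "interrupt current flow somewhere upstream of the outlet.")
print(parsimony_proxy(concise), parsimony_proxy(baroque))  # smaller = more parsimonious
```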

3. Communicative and Pragmatic Accounts

A major development under the Explanatory View Hypothesis is the shift to communication-first or recipient-centric models of explanation. In this paradigm, explanation is not merely the delivery of truth, but a conversational, goal-oriented exchange—optimized for the recipient's epistemic state, decision problem, and downstream use (Cope et al., 2023, Harding et al., 6 May 2025).

For instance, (Harding et al., 6 May 2025) employs a Rational Speech Acts–inspired framework in which the "goodness" of an explanation is the improvement in the listener’s decision-theoretic utility, conditioned on the listener’s prior state and practical goals. This enables formal prediction of audience-adaptive effects such as minimality, proportionality, and contextual relevance, thus grounding explanatory virtues in conversational pragmatics.
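
The toy sketch below illustrates that idea under strong simplifying assumptions: a listener holds a prior over two candidate causes, updates on the content of an explanation, and the explanation's "goodness" is the gain in expected utility of the listener's best action. The priors, likelihoods, actions, and utilities are all invented, and this is a simplification rather than the full Rational Speech Acts machinery of the cited work.

```python
# Toy recipient-centred "goodness" score: the value of an explanation is the
# gain in the listener's expected decision utility after a Bayesian update.
# All numbers below are illustrative placeholders.

def normalize(d):
    z = sum(d.values())
    return {k: v / z for k, v in d.items()}

def best_expected_utility(beliefs, utilities):
    """Expected utility of the listener's best action under current beliefs."""
    return max(sum(beliefs[w] * u[w] for w in beliefs) for u in utilities.values())

prior = {"wiring_fault": 0.5, "blown_fuse": 0.5}        # listener's prior over causes
likelihood = {"wiring_fault": 0.1, "blown_fuse": 0.9}   # P(explanation content | cause)
posterior = normalize({w: prior[w] * likelihood[w] for w in prior})

# Utilities of the listener's candidate actions in each possible world
utilities = {
    "rewire":       {"wiring_fault": 10, "blown_fuse": -2},
    "replace_fuse": {"wiring_fault": -2, "blown_fuse": 10},
}

goodness = best_expected_utility(posterior, utilities) - best_expected_utility(prior, utilities)
print(goodness)  # positive: the explanation improved the listener's decision
```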

In cognitive science, similar ideas explain belief attribution and explanation selection by reference to communicative (or causal) informativeness (2505.19376), showing empirically that humans credit beliefs not just by accuracy but by their causal relevance to observed actions.

4. Explanatory Unification and Scientific Theory Choice

Philosophy of science emphasizes explanatory unification as a key explanatory virtue—preferring theories that account for diverse phenomena through a compact, integrated framework (Allzén, 18 Dec 2024). The EVH motivates the use of unification, scope, and consilience as rational criteria for theory choice, particularly when direct empirical confirmation is limited. However, recent work critiques unification as a non-sufficient ground for belief, warning that unification must be truth-conducive and methodologically rigorous to be epistemically justified (Allzén, 18 Dec 2024).

Fine-tuning and explanatory depth are additional virtues considered essential for discriminating among theories. Robust explanations that do not rely on "just-so" parameters are privileged, as formally captured in mathematical schemas relating fine-tuning measures to explanatory depth (Azhar et al., 2019).
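
A minimal sketch of that schema follows, using the depth measure $D_E = 1 / \prod_i (1 + \mathcal{G}_i)$ quoted in the definitions table below, with invented fine-tuning measures $\mathcal{G}_i$ for two hypothetical theories.

```python
# Sketch of the explanatory-depth schema (following Azhar et al., 2019):
# depth shrinks as the fine-tuning measures G_i of the parameters needed to
# account for the observations grow. The G_i values are illustrative.

def explanatory_depth(fine_tuning_measures):
    """D_E = 1 / prod_i (1 + G_i); more fine-tuning => shallower explanation."""
    depth = 1.0
    for g in fine_tuning_measures:
        depth /= (1.0 + g)
    return depth

robust_theory  = [0.1, 0.05]   # parameters barely need tuning
just_so_theory = [5.0, 8.0]    # parameters must be dialled in precisely
print(explanatory_depth(robust_theory))   # ~0.87: deep, robust explanation
print(explanatory_depth(just_so_theory))  # ~0.02: shallow "just-so" explanation
```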

5. Application in AI and Explainable Systems

In AI, the EVH fundamentally reorients explainable AI (XAI) research toward recipient-centered and functionally evaluable standards. Explanations must not merely reveal internal model mechanisms (the "black-box-to-white-box" transition; Ayonrinde et al., 1 May 2025); they must contribute to human understanding, utility, or trust. Mechanistic Interpretability, as defined in (Ayonrinde et al., 1 May 2025), requires explanations to be model-level, ontic, causal-mechanistic, and falsifiable, with faithfulness to internal model dynamics.
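
As a rough illustration of the faithfulness criterion, the sketch below compares the intermediate activations of a toy two-layer network with those of a candidate mechanistic explanation (the same circuit with one unit claimed to be irrelevant and ablated). The network, the ablation, and the tolerance threshold are invented; faithfulness evaluations in the cited work are considerably more involved.

```python
import numpy as np

# Toy faithfulness check: a mechanistic explanation should reproduce the
# model's intermediate activations, not just its outputs.

rng = np.random.default_rng(0)
W1, W2 = rng.normal(size=(4, 3)), rng.normal(size=(2, 4))

def model_trace(x):
    """Full model: returns the intermediate activation and the output."""
    h = np.tanh(W1 @ x)
    return h, W2 @ h

def explanation_trace(x, pruned_units=(3,)):
    """Candidate mechanistic explanation: same circuit with one unit ablated."""
    h = np.tanh(W1 @ x)
    h[list(pruned_units)] = 0.0   # the explanation claims this unit is irrelevant
    return h, W2 @ h

def faithfulness(xs, tol=0.1):
    """Fraction of inputs whose explanation activations stay within tol of the model's."""
    hits = 0
    for x in xs:
        h_model, _ = model_trace(x)
        h_expl, _ = explanation_trace(x)
        hits += np.max(np.abs(h_model - h_expl)) < tol
    return hits / len(xs)

print(faithfulness([rng.normal(size=3) for _ in range(100)]))
```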

Empirical work demonstrates the practical implications:

  • Explanatory Instructions in computer vision induce greater generalization, supporting the EVH in multimodal AI (Shen et al., 24 Dec 2024).
  • Probabilistic models of explanation in neural networks leverage "reasoning paradigms" (correlation, counterfactual, contrastive) to achieve completeness and actionable justification (AlRegib et al., 2022).
  • Democratic rationales for explanation stress the legitimacy and right to explanation in algorithmic societies, highlighting the need for "explanatory publics" (Berry, 2023).

6. Challenges, Limitations, and Critiques

Despite its broad appeal, the EVH faces substantive challenges:

  • Operationalizing "understanding" is nontrivial; highly formal measures may not always track human cognitive states (Cope et al., 2023).
  • Explanatory virtues may conflict; e.g., simplicity can trade off with completeness or depth (Zukerman, 22 Nov 2024, Azhar et al., 2019).
  • Surrogate models and templates may fail to capture the actual reasoning pathways of complex models (e.g., GPT-3.5), highlighting the need for more faithful or nuanced frameworks (Braun et al., 7 Feb 2024).
  • Overreliance on non-empirical virtues (e.g., unification in dark matter debate) may decouple explanation from empirical testability (Allzén, 18 Dec 2024).

A plausible implication is that future research must refine measures of communicative and cognitive success, develop multi-attribute optimization of explanatory values, and maintain methodological rigor (especially when explanatory arguments substitute for direct evidence).


Selected Technical and Formal Definitions

  • Explanatory Effectiveness: $\text{Effectiveness}(\mathbf{o}_B, p) = U(B^\tau, p) - U(B^1, p)$
  • Understanding (AIT): $U(X, p) = \frac{\hat U(X, p) \cdot c(X, p) \cdot \phi(X, p) \cdot \Upsilon_p(X) \cdot I(z_X; p)}{K(p)}$
  • Bayesian Explanatory Value: $\log p(E \mid x) = \sum_{i} \log p(x_i \mid E) + \ldots$ (see (Wojtowicz et al., 2020) for details)
  • Fine-Tuning/Explanatory Depth: $D_E(\vec{O}; \bm{p}') = \frac{1}{\prod_{i=1}^n [1 + \mathcal{G}_i(\vec{O}; \bm{p}')]}$ (Azhar et al., 2019)
  • Explanatory Faithfulness: the intermediate activations of explanation $E$ match those of model $M$ (Ayonrinde et al., 1 May 2025)
  • Partial Order of Explanations: $X_1 \succeq_E X_2 \iff EP(X_1, E) \geq EP(X_2, E)$ and $\Pr_E(X_1) \geq \Pr_E(X_2)$ (Chajewska et al., 2013)
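
To illustrate the last entry, the sketch below checks the dominance condition for a few hypothetical explanations with invented explanatory-power and probability scores; because the order is only partial, two explanations may be incomparable.

```python
# Illustrative check of the partial order over explanations (following
# Chajewska et al., 2013): X1 dominates X2 when it has both higher
# explanatory power EP and higher posterior probability under evidence E.
# The numeric scores are placeholders.

def dominates(x1, x2):
    """X1 >= X2 iff EP(X1, E) >= EP(X2, E) and Pr_E(X1) >= Pr_E(X2)."""
    return x1["EP"] >= x2["EP"] and x1["prob"] >= x2["prob"]

blown_fuse   = {"EP": 0.90, "prob": 0.6}
wiring_fault = {"EP": 0.70, "prob": 0.3}
power_outage = {"EP": 0.95, "prob": 0.1}

print(dominates(blown_fuse, wiring_fault))  # True: better on both criteria
print(dominates(blown_fuse, power_outage))  # False: the order is only partial
```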

References to Key Papers

  • (Cope et al., 2023): Formal measure of explainee-centric explanatory effectiveness (information-theoretic/AIT).
  • (Harding et al., 6 May 2025): Communication-first pragmatic framework and formal model of explanation as conversational act.
  • (Deemter, 2022): Explanatory value in NLP—distinction from model explainability.
  • (Zukerman, 22 Nov 2024): Survey of explanatory virtues and relationship to XAI, formalization of explanation value.
  • (Azhar et al., 2019): Quantitative link between fine-tuning and explanatory depth.
  • (Wojtowicz et al., 2020): Bayesian decomposition of explanatory values and empirical grounding.
  • (Shen et al., 24 Dec 2024): Support for explanatory representations driving zero-shot generalization in CV.
  • (Ayonrinde et al., 1 May 2025): Explanatory View Hypothesis in Mechanistic Interpretability.
  • (Chajewska et al., 2013): Probabilistic and causal criteria for valid and better explanations in AI systems.
  • (Berry, 2023): Implications for democratic legitimacy and explanatory publics.
  • (Allzén, 18 Dec 2024): Critical analysis of explanatory unification as epistemic warrant.
  • (Braun et al., 7 Feb 2024): Limitations of hypothesis-driven surrogate models for LLM explanations.
  • (AlRegib et al., 2022): Observed explanatory paradigms; completeness via correlation, counterfactual, and contrastive reasoning.
  • (2505.19376): Computational model of belief attribution as mental explanation.

The Explanatory View Hypothesis provides a framework that bridges formal, empirical, and practical dimensions of explanation—enabling rigorous assessment in fields ranging from AI and model interpretability to scientific theory choice and public justification. Ongoing research aims to refine these frameworks to optimize recipient understanding, explanatory virtues, and functional impact across real-world contexts.
