Causality-Based Necessary Explanations
- Causality-based necessary explanations are a rigorous framework that identifies the minimal set of features whose alteration changes an outcome.
- They leverage structural causal models and counterfactual interventions to ensure that every component in the explanation is indispensable.
- Algorithmic approaches, including brute-force search, answer set programming (ASP), and explanation trees, demonstrate practical applications in classification, decision making, and databases.
Causality-based necessary explanations form a rigorous class of model explanations rooted in structural causal models (SCMs), counterfactual analysis, and minimality principles. They formalize the demand for answers to “what must have happened for this outcome to occur?” by identifying the minimal features, variables, or input conditions whose alteration suffices to remove (or bring about) a target outcome. In contrast to association-based or purely feature-attribution methods, causality-based necessary explanations enjoy strong counterfactual guarantees: each explanation exhibits the minimal set of changes in the causal graph or model input for which the effect disappears, and deleting any explanation component breaks explanatory power. This article surveys the theoretical foundations, formal definitions, algorithmic paradigms, representative frameworks, complexity results, and methodological innovations for causality-based necessary explanations across classification, sequential decision making, knowledge representation, and database systems.
1. Theoretical Foundations: Structural Models and Counterfactual Necessity
Causality-based necessary explanations are grounded in the formalism of SCMs and the Halpern–Pearl framework for actual causality. An SCM is a tuple $M = \langle \mathbf{U}, \mathbf{V}, \mathcal{F} \rangle$, where $\mathbf{V}$ are endogenous variables, $\mathbf{U}$ are exogenous (noise) variables, and each endogenous variable $V \in \mathbf{V}$ has a structural equation $V = f_V(\mathrm{pa}(V))$ specifying its value given its parents. The causal semantics of necessary explanations leverage the do-operator: an intervention $do(\mathbf{X} = \mathbf{x})$ modifies the model by setting $\mathbf{X}$ to $\mathbf{x}$ and propagating the effects.
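The following is a minimal sketch of how an SCM and a do-intervention can be represented programmatically; the toy loan-decision model, variable names, and structural equations are illustrative assumptions, not the implementation of any cited system.

```python
# Minimal sketch of an SCM with do()-interventions (illustrative only;
# the variables and structural equations are hypothetical).

class SCM:
    def __init__(self, equations):
        # equations: dict mapping each endogenous variable to a function of
        # the already-computed values (its parents and exogenous noise),
        # listed in topological order.
        self.equations = equations

    def evaluate(self, context, interventions=None):
        """Compute all endogenous values under exogenous context `context`,
        optionally overriding variables via do(X = x) `interventions`."""
        interventions = interventions or {}
        values = dict(context)
        for var, f in self.equations.items():
            values[var] = interventions[var] if var in interventions else f(values)
        return values

# Toy model: the application is rejected iff income is low and debt is high.
model = SCM({
    "income": lambda v: v["u_income"],
    "debt":   lambda v: v["u_debt"],
    "reject": lambda v: int(v["income"] == "low" and v["debt"] == "high"),
})

context = {"u_income": "low", "u_debt": "high"}
print(model.evaluate(context))                      # factual world: reject = 1
print(model.evaluate(context, {"income": "high"}))  # do(income=high): reject = 0
```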
The minimal necessary (actual) cause of an event $\varphi$ in a model $M$ under context $\mathbf{u}$ is a (minimal) set of value assignments $\mathbf{X} = \mathbf{x}$ such that:
- (NC1) $(M, \mathbf{u}) \models (\mathbf{X} = \mathbf{x}) \wedge \varphi$ (factuality),
- (NC2) there exists an alternative assignment $\mathbf{x}'$ such that $(M, \mathbf{u}) \models [\mathbf{X} \leftarrow \mathbf{x}']\,\neg\varphi$ (counterfactual violation under intervention),
- (NC3) Minimality: no strict subset of $\mathbf{X}$ suffices (Künnemann, 2017, Bertossi et al., 2020).
This guarantees not only that the explanation is causally necessary for the effect but also that explanations are as concise as possible, excluding redundant factors.
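As a toy illustration (not drawn from the cited papers), consider an SCM with $Y := X_1 \lor X_2$ and a context $\mathbf{u}$ fixing $X_1 = 1$, $X_2 = 0$, with event $\varphi: Y = 1$. The assignment $X_1 = 1$ satisfies NC1 (both $X_1 = 1$ and $Y = 1$ hold), NC2 (under $[X_1 \leftarrow 0]$ we obtain $Y = 0$), and NC3 (a singleton has no strict subset), so it is a minimal necessary cause. In a context with $X_1 = X_2 = 1$, no singleton satisfies NC2 under this simple but-for reading, and the minimal necessary cause is the pair $\{X_1, X_2\}$; the full Halpern–Pearl definition additionally allows contingencies that can restore singleton causes.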
2. Formal Definitions and Duality: Necessary vs. Sufficient Explanations
Formally, let $\mathbf{S} \subseteq \mathbf{V}$ be a subset of variables/features. $\mathbf{S}$ is a necessary explanation for outcome $\varphi$ if, for some value assignment $\mathbf{s}'$,
- $(M, \mathbf{u}) \models [\mathbf{S} \leftarrow \mathbf{s}']\,\neg\varphi$,
- for every strict subset $\mathbf{S}' \subsetneq \mathbf{S}$, $(M, \mathbf{u}) \models [\mathbf{S}' \leftarrow \mathbf{s}'']\,\varphi$ for all assignments $\mathbf{s}''$.
This minimality captures necessity in the sense that all components are required: removing any one prevents the outcome change (Bertossi et al., 2020, Bertossi et al., 19 Nov 2025, Künnemann, 2017). The dual notion, sufficiency, considers sets $\mathbf{S}$ such that fixing $\mathbf{S} = \mathbf{s}$ guarantees the effect regardless of the values of the other variables; necessary and sufficient explanations are formal duals, and minimal sufficient sets can be mechanically transformed into minimal necessary sets via Boolean dualization (Künnemann, 2017, Bertossi et al., 19 Nov 2025); a sketch of this dualization follows the table below.
Table: Core notions of causality-based explanation
| Explanation Type | Definition | Minimality Criterion |
|---|---|---|
| Necessary | Intervening on the set removes the outcome | No strict subset also flips the outcome |
| Sufficient | Fixing the set guarantees the outcome, regardless of the remaining variables | No strict subset is still sufficient |
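The dualization can be illustrated with a naive sketch: treating the minimal sufficient sets as a hypergraph over features, the minimal necessary sets are its minimal transversals (hitting sets). The enumeration below assumes a small, monotone setting and is purely illustrative; the cited works use more refined machinery.

```python
# Naive sketch: derive minimal necessary sets as minimal hitting sets
# (hypergraph transversals) of the minimal sufficient sets.
from itertools import chain, combinations

def minimal_hitting_sets(families):
    """Enumerate all inclusion-minimal sets hitting every set in `families`."""
    universe = sorted(set(chain.from_iterable(families)))
    hitting = []
    for r in range(1, len(universe) + 1):
        for cand in combinations(universe, r):
            c = set(cand)
            # keep c only if it hits every family and no smaller hitting set is contained in it
            if all(c & f for f in families) and not any(h <= c for h in hitting):
                hitting.append(c)
    return hitting

# Hypothetical minimal sufficient sets (e.g., minimal query witnesses).
sufficient = [{"income", "debt"}, {"income", "age"}]
print(minimal_hitting_sets(sufficient))
# the minimal necessary sets: {'income'} and {'age', 'debt'}
```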
3. Algorithmic Paradigms for Extraction
Several algorithmic strategies instantiate the formal definitions:
Brute-force and Greedy Search: For classifiers, enumerate all subsets in order of size, check if intervening on the subset suffices for outcome change, and if so, verify minimality. For high-dimensional settings, compute responsibility scores (as in (Bertossi et al., 2020, Bertossi, 2020)) and add features greedily by marginal necessity.
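The brute-force strategy can be sketched as follows; the classifier, instance, and feature domains are hypothetical stand-ins for a black-box model, and the enumeration is only practical for small feature sets.

```python
# Brute-force search for cardinality-minimal necessary explanations of a
# black-box classifier decision (illustrative; classifier and domains are toy).
from itertools import combinations, product

def minimal_necessary_explanations(predict, instance, domains):
    """Smallest feature sets whose joint change flips predict(instance)."""
    original = predict(instance)
    features = list(instance)
    for size in range(1, len(features) + 1):
        found = []
        for subset in combinations(features, size):
            # Try every joint reassignment of the chosen features.
            for values in product(*(domains[f] for f in subset)):
                changed = dict(instance, **dict(zip(subset, values)))
                if predict(changed) != original:
                    found.append(dict(zip(subset, values)))
                    break
        if found:
            return found  # all explanations of minimal cardinality
    return []

# Toy loan-rejection classifier.
predict = lambda x: "reject" if x["income"] == "low" and x["debt"] == "high" else "accept"
instance = {"income": "low", "debt": "high", "age": "young"}
domains = {"income": ["low", "high"], "debt": ["low", "high"], "age": ["young", "old"]}

print(minimal_necessary_explanations(predict, instance, domains))
# -> [{'income': 'high'}, {'debt': 'low'}]
```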
ASP and Logic Programming: Logic programming approaches (ASP or s(CASP)) declaratively encode the effect/decision logic, domain constraints, and minimality, then leverage solver optimization to return cardinality-minimal interventions (Bertossi, 2020, Dasgupta et al., 24 May 2024, Dasgupta et al., 11 Jul 2024). Causal literals and non-monotonic semantics (e.g., (Fandinno, 2016)) directly express necessity queries, and their stable-model evaluation captures the global non-monotonic dependencies involved.
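A schematic encoding in this spirit is sketched below using the clingo Python API; the ASP rules (a toy surrogate for the decision logic) and feature domains are illustrative assumptions and do not reproduce the encodings of the cited papers.

```python
# Schematic ASP encoding of cardinality-minimal interventions (illustrative).
# Requires the clingo Python package.
import clingo

PROGRAM = """
% original feature values of the entity being explained
orig(income, low).  orig(debt, high).  orig(age, young).
dom(income, low; income, high).
dom(debt, low; debt, high).
dom(age, young; age, old).

% guess at most one counterfactual value per feature
{ set(F, V) : dom(F, V), V != W } 1 :- orig(F, W).
changed(F) :- set(F, _).
val(F, V)  :- set(F, V).
val(F, V)  :- orig(F, V), not changed(F).

% toy surrogate of the classifier's decision logic
reject :- val(income, low), val(debt, high).

% the counterfactual must flip the decision ...
:- reject.
% ... using a cardinality-minimal set of changed features
#minimize { 1, F : changed(F) }.
#show changed/1.
"""

ctl = clingo.Control(["--opt-mode=optN", "0"])   # enumerate all optimal answer sets
ctl.add("base", [], PROGRAM)
ctl.ground([("base", [])])
ctl.solve(on_model=lambda m: print(m.symbols(shown=True))
          if m.optimality_proven else None)
# Expected optimal answer sets: changed(income) and changed(debt).
```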
Explanation Trees and Information Flow: In Bayesian networks, causal explanation trees recursively grow explanations by selecting variables that maximize causal information flow toward the target evidence, splitting until each assignment is non-redundant and the selected assignments collectively boost the probability of the explanandum (Nielsen et al., 2012).
SCM Recursion and Symbolic Explanation: For SCMs (e.g., (Zečević et al., 2021)), explanations are recursively compiled via graph-theoretic traversal, backpropagating why-questions through only direct parent variables and composing minimal, direct causal chains.
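A bare-bones sketch of such a parent-directed recursion is given below; the causal graph and textual rendering are hypothetical simplifications of the cited framework, which additionally composes symbolic answers.

```python
# Sketch of a recursive "why?" traversal over an SCM's parent graph
# (illustrative; the graph and rendering are hypothetical simplifications).

parents = {                      # direct causes of each endogenous variable
    "grade": ["studying", "difficulty"],
    "studying": ["motivation"],
    "difficulty": [],
    "motivation": [],
}

def explain(var, depth=0):
    """Answer 'why var?' by recursing only through direct parents,
    composing a chain of direct causal links."""
    indent = "  " * depth
    if not parents[var]:
        print(f"{indent}{var} (exogenous/root factor)")
        return
    print(f"{indent}{var} because of {', '.join(parents[var])}")
    for p in parents[var]:
        explain(p, depth + 1)

explain("grade")
```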
Database Repairs and Hitting Sets: In monotone Boolean queries over relational databases, minimal necessary explanations correspond to minimal sets whose removal violates the query—computable as minimal hitting sets of the set of minimal query witnesses. This links directly to actual causality and database repairs (Bertossi et al., 19 Nov 2025).
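For illustration (a toy instance, not taken from the cited papers): if a monotone query has exactly two minimal witnesses $\{t_1, t_2\}$ and $\{t_1, t_3\}$, its minimal necessary explanations are the minimal hitting sets $\{t_1\}$ and $\{t_2, t_3\}$: removing $t_1$ alone falsifies the query, while removing $t_2$ or $t_3$ alone does not.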
4. Methodological Frameworks and Case Studies
Black-box Classification and Feature Attribution
Causality-based necessary explanations for black-box classifiers formalize minimal actual causes via direct counterfactual testing: finding the smallest feature set whose intervention flips the class (Bertossi et al., 2020). ASP-based methods declaratively encode the intervention space and compute cardinality-minimal explanations efficiently (Bertossi, 2020). Notably, feature-importance methods such as SHAP do not provide necessity guarantees, as they permit positive attributions for features neither necessary nor sufficient for outcome change (Bertossi et al., 2020, Mothilal et al., 2020). Counterfactual-based evaluation reveals that LIME or SHAP-ranked features often fail "necessity" checks, particularly in high-dimensional settings (Mothilal et al., 2020).
Sequential Decision Making
In the context of MDPs and policies, SCMs are used to encode the complete computation of the decision policy; a necessary explanation identifies the minimal set of state factors whose change would alter the agent's action under a given trajectory prefix (Nashed et al., 2022). Efficient dynamic programming and beam search traverse the layered causal computation graph to enumerate minimal necessary reasons for agent choices.
Knowledge Representation and Reasoning
Non-monotonic logic programming with causal literals extends ASP semantics to include direct representations of necessity. Necessary-cause queries explicitly test, for an atom $A$ and event $e$, whether all alternative justifications of $A$ collapse to uses of $e$. Rule bodies incorporating these literals support legal, diagnostic, or elaboration-tolerant reasoning (Fandinno, 2016).
Databases and Data Provenance
Minimal necessary explanations in data management quantify for each endogenous tuple its role as a necessary cause for a query answer (or non-answer). Responsibility scores correspond to the inverse size of the minimal necessary set; these coincide with contingency-based actual causes in the Halpern–Pearl framework. Enumeration algorithms reduce to minimal hitting-set computation on the lineage/witness hypergraph, and maintain tight connections to database repairs and denial constraints (Bertossi et al., 19 Nov 2025, 0912.5340).
5. Empirical Validation and Observed Advantages
Experimental evaluation across benchmarks demonstrates that causality-based necessary explanations yield:
- Extremely concise (often singleton or pairwise) explanations for individual classification decisions in financial risk and fraud domains, outstripping brute-force searches for all but a tiny fraction of cases (Bertossi et al., 2020);
- Superior alignment with ground-truth or human-annotated rationales in review classification, medical claims, and OOD settings; critical robustness to spurious associations (Zhang et al., 2023);
- Structured, human-readable explanation chains in SCMs, matching human intuition in user studies and serving as regularizers for graph learning (Zečević et al., 2021);
- Consistency and robustness under counterfactual intervention, independent of model specification (black-box or white-box);
- Polynomial-delay computation in many database, logic, and planning settings; practical performance for moderate feature sets (Dasgupta et al., 24 May 2024, Bertossi et al., 19 Nov 2025).
6. Complexity, Limitations, and Extensions
The decision problem “Is a necessary explanation for ?” is coNP-complete in general classifiers, and NP-hard for minimum-explanation search due to the need for minimality testing and hypergraph hitting-set enumeration (Bertossi et al., 2020, Bertossi et al., 19 Nov 2025). Specialized cases (tree-shaped causal networks, safe conjunctive queries, or stratified logic programs) admit polynomial-time solutions (0912.5340, Bertossi et al., 19 Nov 2025). Key practical limitations include scaling to high-dimensional or continuous domains, and dealing with latent confounding or non-discrete causal graphs (Dasgupta et al., 24 May 2024, Dasgupta et al., 11 Jul 2024). Open research seeks efficient approximations, causal pruning, and robust integration with learned or partially specified causal structures.
7. Connections, Duality, and Open Horizons
The duality of necessary and sufficient explanations—precise in the sense of propositional dualization—enables interconversion without recomputing counterfactuals (Künnemann, 2017, Bertossi et al., 19 Nov 2025). This underpins the construction of lean explanations and the transfer of algorithmic machinery between the two domains. In practice, causality-based necessary explanations are increasingly embedded into goal-directed ASP planning, deep-learning saliency mechanisms (e.g., SUNY (Xuan et al., 2023)), interpretable rationalization, database systems, and XAI pipelines for critical decision support.
Future work is poised to enhance tractability for large-scale systems, extend necessity guarantees to probabilistic and sequential contexts, and further refine necessity metrics through human-in-the-loop evaluation and causal abstraction.