Decompose and Explain (DeX) for AI Interpretability

Updated 28 December 2025
  • DeX is a methodology that decomposes complex tasks into explicit, interpretable sub-components using structured algorithms and modular pipelines.
  • It employs techniques such as factual correction, GAN-based adversarial decomposition, and dynamic tracing to ensure fidelity and transparency across modalities.
  • The approach enhances model interpretability by quantifying contributions, improving accuracy, and facilitating bias detection in AI systems.

Decompose and Explain (DeX) refers to a class of frameworks, algorithms, and methodologies in artificial intelligence, machine learning, and cognitive science that perform explicit task or signal decomposition followed by structured explanation or interpretation. The common denominator across DeX systems is the transformation of a complex input, decision, or computational artifact into an explicit set of interpretable components—often structured as sub-tasks, additive parts, or hierarchical evidence—thereby enabling robust, transparent, and often quantitative explanations of model behavior or system function.

1. Foundational Principles of Decompose-and-Explain

DeX approaches fundamentally address the need to bridge the gap between model complexity and human-centered interpretability. In domains such as natural language understanding, image classification, model decompilation, counterfactual explanation, and cognitive task modeling, DeX formalizes decomposition as the generation of sub-parts—typically subquestions, components, or subgoals—and then links these discrete units to the overall system output via mechanistic or statistical explanation.

Core traits of DeX systems include:

  • Explicit Intermediate Representations: Mapping inputs to structured or additive sub-components whose contributions are directly interpretable.
  • Functional Decomposition: Decomposing functions, tasks, or decisions by leveraging data-driven, mathematical, or signal-processing principles.
  • Modularity and Transparency: Designing pipelines where each reasoning step (decomposition, fact-checking, entailment) is explicit and auditably separated.
  • Evaluation of Explanation Faithfulness: Employing metrics that directly assess whether removal, modification, or analysis of each sub-component meaningfully impacts system output (a minimal deletion-style check is sketched after this list).
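
That faithfulness criterion can be made concrete with a deletion-style check: remove each sub-component in turn and measure how the model's output changes. The sketch below is a minimal illustration of such a metric, assuming a classifier with a `predict_proba`-style interface and a hypothetical `mask_component` helper; neither is tied to any particular DeX system.

```python
import numpy as np

def deletion_faithfulness(model, x, components, mask_component):
    """Score each sub-component by how much its removal changes the model's
    confidence in the original prediction. `model`, `components`, and
    `mask_component` are hypothetical stand-ins, not a specific DeX API."""
    base_probs = model.predict_proba([x])[0]
    pred_class = int(np.argmax(base_probs))
    scores = {}
    for comp in components:
        x_masked = mask_component(x, comp)            # remove one sub-component
        masked_probs = model.predict_proba([x_masked])[0]
        # Drop in predicted-class probability when this component is removed.
        scores[comp] = float(base_probs[pred_class] - masked_probs[pred_class])
    return pred_class, scores
```

Components with large positive scores are those whose removal most degrades the original prediction, i.e., the parts to which a faithful explanation should attribute the decision.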

This paradigm stands in contrast to post hoc saliency or scoring approaches by prioritizing self-explanatory, structured outputs and rigorous attribution for enhanced interpretability.

2. DeX in Natural Language Decomposition and QA Systems

A salient instantiation of the DeX paradigm is in decomposition-driven NLU and QA. In "Learning to Decompose: Hypothetical Question Decomposition Based on Comparable Texts" (Zhou et al., 2022), DecompT5 is pretrained to generate complementary facts from parallel news data, effectively learning to decompose complex queries as an intermediate pretraining task. A DeX-style downstream pipeline (DecompEntail) comprises:

  1. Decomposition: A fine-tuned transformer generates a chain of 1–3 sub-statements needed to answer multi-hop or hypothetical queries.
  2. Factual Correction: Each decomposition output is validated and corrected for factual fidelity by a specialized LLM such as GPT-3.
  3. Entailment: The set of corrected statements is fused and fed into a textual entailment model, which infers the final answer (a minimal sketch of the full pipeline follows).
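
The sketch below traces this three-stage flow. The `decomposer`, `fact_corrector`, and `entailment_model` callables are hypothetical stand-ins for the fine-tuned decomposition model, the LLM-based corrector, and the entailment classifier; they are illustrative assumptions, not the actual DecompEntail interfaces.

```python
from typing import Callable, List

def decomp_entail(question: str,
                  decomposer: Callable[[str], List[str]],
                  fact_corrector: Callable[[str], str],
                  entailment_model: Callable[[str, str], float]) -> bool:
    """Hypothetical DecompEntail-style pipeline:
    decompose -> correct each sub-statement -> entail over the fused facts."""
    # 1. Decomposition: 1-3 sub-statements needed to answer the question.
    sub_statements = decomposer(question)

    # 2. Factual correction: each sub-statement is validated/rewritten by an LLM.
    corrected = [fact_corrector(s) for s in sub_statements]

    # 3. Entailment: fuse the corrected statements as the premise and score
    #    whether they entail an affirmative answer to the question.
    premise = " ".join(corrected)
    return entailment_model(premise, question) > 0.5
```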

Empirical results show substantial gains over strong baselines in both semantic parsing and QA tasks—e.g., +27.3 Hit@5 on Overnight, +32.5 pp exact match on TORQUE, +4–8% over GPT-3 on StrategyQA and HotpotQA—demonstrating the effectiveness of explicit decomposition followed by high-fidelity validation and reasoning.

This workflow demonstrates essential traits of DeX frameworks: robust decomposition, explicit modularity, and enhanced interpretability via intermediate rationales.

3. DeX Approaches in Explainable AI for Vision

DeX frameworks have produced significant advances in image explainability by moving beyond traditional pixel-wise heatmaps to additive, functionally defined decompositions. As formalized in DXAI (Kadar et al., 2023), the image $x$ is expressed as the sum of a class-agnostic component $\psi_{\mathrm{Agnostic}}$, provably neutral to the classifier, and a class-distinct component $\psi_{\mathrm{Distinct}}$ containing exactly the discriminative content:

$$x = \psi_{\mathrm{Agnostic}} + \psi_{\mathrm{Distinct}}$$

The class-agnostic part is enforced to yield uniform classifier output, while the distinct part is everything needed to reconstruct the predicted class profile. The decomposition is achieved by an adversarial, multi-branch GAN trained with an explicit loss on both class neutrality and reconstruction fidelity. DeX explanations then consist of full-resolution, multi-channel images representing discriminative content, elucidating what the model "relied on" versus what was ignored.
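
As an illustration of these two constraints, a DXAI-style objective might combine a reconstruction term with a class-neutrality term roughly as follows. The `generator` (returning the two branches) and `classifier` (returning logits) interfaces and the loss weights are assumptions for this sketch, which also omits the adversarial and multi-branch components of the actual DXAI training scheme.

```python
import torch
import torch.nn.functional as F

def dxai_style_losses(generator, classifier, x, lambda_rec=1.0, lambda_neutral=1.0):
    """Sketch of a DXAI-style decomposition objective (illustrative only).
    generator(x) is assumed to return (psi_agnostic, psi_distinct) with the
    same shape as x; classifier(img) returns unnormalized class logits."""
    psi_agnostic, psi_distinct = generator(x)

    # Reconstruction fidelity: the two parts must sum back to the input image.
    rec_loss = F.l1_loss(psi_agnostic + psi_distinct, x)

    # Class neutrality: the agnostic part should yield a uniform class posterior.
    logits = classifier(psi_agnostic)
    uniform = torch.full_like(logits, 1.0 / logits.shape[-1])
    neutral_loss = F.kl_div(F.log_softmax(logits, dim=-1), uniform,
                            reduction="batchmean")

    return lambda_rec * rec_loss + lambda_neutral * neutral_loss
```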

Experimental benchmarks demonstrate DeX's superiority in faithfulness and specificity compared to gradient- or relevance-based heatmaps in domains where class cues are dense, global, or non-local (e.g., color, texture, additive noise, tumor identification), and validate that the decomposition is classifier-specific and stable across runs.

4. Cross-Modal and Counterfactual Explanation via Decomposition

A recent extension of DeX to subjective and cross-modal tasks is exemplified in the development of semantically grounded counterfactual explanations for privacy-sensitive image classification (Baia et al., 21 Dec 2025). Here, DeX operates by:

  • Mapping both images and natural language concepts into a joint embedding space.
  • Extracting image-specific concepts (e.g., "driver's license," "hospital room") using LVLM and LLM pipelines.
  • Quantifying the contribution of each concept by vector-arithmetic removal, checking whether the classifier's decision flips (see the sketch after this list).
  • Selecting Pareto-optimal, diverse explanatory concepts using both proximity (embedding similarity) and confidence.
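
The concept-removal step can be illustrated in a joint embedding space as below. Here `embed_text` and `classify_from_embedding` are hypothetical stand-ins for the text encoder and the embedding-space classifier, and projecting out the concept direction is one plausible form of the vector arithmetic rather than the exact operation of the published method.

```python
import numpy as np

def concept_counterfactuals(image_emb, concepts, embed_text, classify_from_embedding):
    """For each extracted concept, remove its direction from the image embedding
    and check whether the classifier's decision flips (hypothetical interfaces)."""
    original_label = classify_from_embedding(image_emb)
    results = []
    for concept in concepts:
        c = embed_text(concept)
        c = c / np.linalg.norm(c)
        # Project the concept direction out of the image embedding.
        edited = image_emb - np.dot(image_emb, c) * c
        proximity = float(np.dot(edited, image_emb) /
                          (np.linalg.norm(edited) * np.linalg.norm(image_emb)))
        results.append({
            "concept": concept,
            "flips_decision": classify_from_embedding(edited) != original_label,
            "proximity": proximity,   # higher = closer to the original embedding
        })
    # Keep valid counterfactual concepts, closest edits first.
    return sorted([r for r in results if r["flips_decision"]],
                  key=lambda r: -r["proximity"])
```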

This framework enables quantification of concept-level "importance," supports objective comparison to other explanation methods (e.g., CounTEX), and permits systematic identification of dataset biases through topic modeling of the explanations. The DeX method achieves high validity, sparsity, proximity, feasibility, and diversity, highlighting its utility in both individual and aggregate analysis of model decision rationales and bias sources.

5. DeX and Functional Decomposition in Model Explanation

DeX formalism also appears in global and local model explanation through functional decomposition, as in (Hiabu et al., 2022). For a model $f(x)$ on $d$ features, DeX isolates:

$$f(x) = \mu + \sum_{k=1}^{d} f_k(x_k) + \sum_{|S|>1} f_S(x_S)$$

with the $f_k$ terms representing main effects and the $f_S$ terms higher-order interactions, each uniquely defined via a marginal identification constraint that integrates over subsets of features. Interventional SHAP values and partial dependence plots are then simple functionals of these decomposed components, unifying local and global interpretation (a toy numerical sketch follows the list below). Key implications include:

  • Exact Computation: For low-dimensional additive models (e.g., certain ensembles, random planted forests), all $f_S$ can be exactly recovered.
  • Bias Removal: Direct and indirect effects of protected variables can be eliminated post hoc by zeroing $f_S$ whenever $S$ contains a protected variable.
  • Decomposed Feature Importance: DeX allows attribution of prediction variance to main effects or to specific interactions, improving over conventional SHAP or PDP methods in revealing the true dependency structure.
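
To make the decomposition tangible, the toy sketch below estimates the main effects and the pairwise interaction of a two-feature model by empirical marginalization over a background sample, then assembles interventional SHAP values as simple functionals of the parts (for two features, $\phi_k = f_k + \tfrac{1}{2} f_{12}$ under a product-measure identification). This is an illustrative convention, not the estimator of Hiabu et al. (2022).

```python
import numpy as np

def decompose_two_features(f, x, background):
    """Toy decomposition f(x) = mu + f1(x1) + f2(x2) + f12(x1, x2) for a
    two-feature model via empirical marginalization over `background`,
    treating the features as independent (illustrative identification)."""
    X1, X2 = background[:, 0], background[:, 1]
    mu = np.mean([f(a, b) for a in X1 for b in X2])
    f1 = np.mean([f(x[0], b) for b in X2]) - mu       # main effect of feature 1
    f2 = np.mean([f(a, x[1]) for a in X1]) - mu       # main effect of feature 2
    f12 = f(x[0], x[1]) - mu - f1 - f2                # pairwise interaction
    return mu, f1, f2, f12

# Example: a model with a genuine interaction term.
f = lambda x1, x2: 3.0 + 2.0 * x1 + x1 * x2
background = np.random.default_rng(0).normal(size=(200, 2))
mu, f1, f2, f12 = decompose_two_features(f, np.array([1.0, 2.0]), background)

# Interventional SHAP values as functionals of the decomposed components.
phi_1 = f1 + 0.5 * f12
phi_2 = f2 + 0.5 * f12   # phi_1 + phi_2 == f(x) - mu (efficiency)
```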

Empirical results demonstrate that this approach identifies higher-order and intricate dependencies overlooked by conventional techniques, and that it is computationally practical for models with small interaction order.

6. DeX in Neural Network Decompilation and Model Forensics

In the context of model decompilation, DeX principles are instantiated in the NeuroDeX pipeline (Li et al., 8 Sep 2025), which reconstructs high-level models from heavily optimized and quantized binaries via the interplay of dynamic tracing and LLM-driven code analysis. The pipeline comprises:

  • Operator Function Extraction: Identifying and extracting "operator functions" and memory signatures from binaries.
  • Operator Type Recognition & Attribute Recovery: Using LLM softmax prompts and dynamic-taint analysis to discriminate between fused and base operators, and recovering parameter settings via code analysis or forward simulation.
  • Graph & Parameter Reconstruction: Building directed acyclic computational graphs, recovering weights and layout transformations, and dequantizing as necessary (a schematic sketch follows this list).
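
In schematic form, the reconstruction stage might look like the following: recovered operator records are assembled into a directed acyclic graph and integer weights are mapped back to floating point with the standard affine dequantization w = scale * (q - zero_point). The record format is a deliberately simplified, hypothetical stand-in, not NeuroDeX's internal representation.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional
import numpy as np

@dataclass
class RecoveredOp:
    """Hypothetical record emitted by the extraction/recognition stages."""
    name: str
    op_type: str                          # e.g. "conv2d", "dense", "relu"
    inputs: List[str]                     # names of upstream operators
    attrs: Dict[str, int] = field(default_factory=dict)
    q_weights: Optional[np.ndarray] = None
    scale: float = 1.0
    zero_point: int = 0

def dequantize(op: RecoveredOp) -> Optional[np.ndarray]:
    """Affine dequantization of recovered integer weights."""
    if op.q_weights is None:
        return None
    return op.scale * (op.q_weights.astype(np.float32) - op.zero_point)

def build_graph(ops: List[RecoveredOp]) -> Dict[str, List[str]]:
    """Assemble a DAG as an adjacency list (edges from producer to consumer)."""
    edges: Dict[str, List[str]] = {op.name: [] for op in ops}
    for op in ops:
        for parent in op.inputs:
            edges.setdefault(parent, []).append(op.name)
    return edges
```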

NeuroDeX achieves near-lossless recovery (100% attribute accuracy and model inference accuracy) for standard, non-quantized models, and functionally similar recovery (≥72% top-1 on quantized, KL-calibrated models), significantly exceeding the performance and speed of prior systems while robustly handling a broad spectrum of compiler optimizations.

7. DeX for Hierarchical, Cognitive, and Token-Level Decomposition

DeX principles inform hierarchical or recursively-structured explanations in both vision and LLMs. In CNNs, Deeply eXplain (DeX) (Cheng et al., 2022) and gAP modules decompose decisions layer-wise with per-channel relevance, constructing explicit evidence hierarchies—from top-layer neurons to pixel-level support—yielding interpretability at multiple abstraction levels. This approach also exhibits strong alignment with ablation-based faithfulness metrics.

For transformers, DecompX (Modarressi et al., 2023) traces information flow token-wise, decomposing each token representation into unique contributions from each input token and propagating these at every layer—including the attention, FFN, normalization, and classification head—culminating in numerically faithful per-class, per-token attribution vectors without summation or norm-based aggregation. Empirical faithfulness (AOPC, accuracy under masking) is sharply improved over gradient- and vector-based competitors.
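
The core bookkeeping behind such token-wise decomposition can be sketched for the purely linear steps: each token's representation is kept as a sum of per-source-token contributions, and attention mixing and linear layers act on each contribution separately, so the decomposition remains exact. Nonlinearities, normalization, and DecompX's specific bias-propagation rules are omitted here; the even bias split is one possible convention, not the paper's.

```python
import numpy as np

def init_decomposition(embeddings):
    """decomp[i, j] = contribution of input token j to token i's representation.
    Initially each token is attributed entirely to itself."""
    n, d = embeddings.shape
    decomp = np.zeros((n, n, d))
    decomp[np.arange(n), np.arange(n)] = embeddings
    return decomp

def propagate_attention(decomp, attn):
    """Attention value-mixing: token i becomes sum_k attn[i, k] * h_k, so each
    source-token contribution is mixed with the same weights."""
    return np.einsum("ik,kjd->ijd", attn, decomp)

def propagate_linear(decomp, W, b):
    """A linear layer distributes exactly over contributions; the bias is
    split evenly across source tokens (one convention among several)."""
    return decomp @ W.T + b / decomp.shape[1]

def per_token_class_scores(decomp, w_cls):
    """Per-class, per-source-token attribution for a [CLS]-like token 0."""
    return decomp[0] @ w_cls.T            # shape: (n_source_tokens, n_classes)
```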

In cognitive science, DeX is formalized in resource-rational task decomposition frameworks (Correa et al., 2022), which model human subgoal selection as optimization over planning-reward trade-offs. The system identifies optimal or nearly-optimal subgoal sets for structured tasks; explanations for subgoal selection are grounded in quantified planning-cost reduction and path statistics, and the system's predictions exhibit high correlation with human subgoal choices.
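
A toy rendering of such subgoal scoring on a graph-structured task might trade path length off against a planning-cost proxy (here, nodes expanded by breadth-first search), as in the sketch below; the cost model is an illustrative assumption, not the resource-rational model of Correa et al. (2022).

```python
from collections import deque

def bfs_cost(graph, start, goal):
    """BFS on an adjacency-list graph; returns (path_length, nodes_expanded),
    using nodes expanded as a crude proxy for planning cost."""
    frontier, visited, expanded = deque([(start, 0)]), {start}, 0
    while frontier:
        node, dist = frontier.popleft()
        expanded += 1
        if node == goal:
            return dist, expanded
        for nbr in graph[node]:
            if nbr not in visited:
                visited.add(nbr)
                frontier.append((nbr, dist + 1))
    return float("inf"), expanded

def best_subgoal(graph, start, goal, candidates, plan_cost_weight=0.5):
    """Pick the candidate subgoal minimizing path length plus weighted planning
    cost (a toy stand-in for resource-rational subgoal selection)."""
    def score(sg):
        d1, e1 = bfs_cost(graph, start, sg)
        d2, e2 = bfs_cost(graph, sg, goal)
        return (d1 + d2) + plan_cost_weight * (e1 + e2)
    return min(candidates, key=score)
```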


These diverse but structurally unified DeX systems illustrate the generality of Decompose and Explain as both a methodology and a set of algorithmic tools, spanning modalities, architectures, and interpretative objectives. The unifying theme is explicit decomposition followed by measurable, human-interpretable explanation, often with superior empirical performance and a basis for bias detection, fairness analysis, or transparency.
