Question Decomposition Strategy
- Question Decomposition Strategy is a method that splits complex, multi-step questions into simpler parts, improving processing efficiency and accuracy in QA systems.
- It employs various techniques such as explicit, neural, and graph-based decomposition to isolate critical reasoning steps and enhance interpretability.
- Its modular design enables effective answer composition across modalities, yielding measurable performance gains in diverse QA benchmarks.
A question decomposition strategy is a set of algorithmic and architectural techniques that divides a complex, multi-step, or compositional question into simpler sub-units—typically simpler sub-questions or structured fields—whose answers or representations can be more efficiently and reliably processed, and then recombined to obtain the final answer. Such strategies play a pivotal role in question answering (QA) systems that must resolve reasoning paths across unstructured text, structured knowledge bases, or multimodal data. Question decomposition enables these systems to address compositionality, improve interpretability, boost faithfulness of explanation, and in certain data regimes, substantially improve accuracy.
1. Principles and Formal Definitions
A question decomposition strategy always involves (at least) two mappings:
- Decomposition: A function (or sequence transducer) D that takes a complex input question q and outputs a sequence or graph-structured set of simpler units, e.g., D(q) = (q_1, …, q_k) for sub-questions, or a tree/graph for hierarchical decompositions.
- Composition: A second function C that aggregates the answers (or intermediate representations) from sub-units back into a final answer, typically a = C(a_1, …, a_k).
Depending on the setting and modality, decomposition may be explicit (human annotation, model output as text, or structural representation) or implicit (latent fields, operator tokens, attention distributions). Some representative formalizations include:
- Explicit decomposition: q ↦ (q_1, …, q_k), with the final answer obtained as a = C(a_1, …, a_k) from the sub-answers (Wei et al., 2022).
- Field-based structural decomposition: Each question q and document d is represented as aligned fields f_1, …, f_m (lexical, syntactic, semantic), scored via per-field retrieval and a weighted sum Σ_i λ_i s_i(q, d) over per-field scores s_i (Jurczyk et al., 2016).
- Hierarchical/graph/tree-based decompositions: E.g., question decomposition trees (QDTs) (Huang et al., 2023), hierarchical question decomposition trees (HQDTs) (Zhang et al., 2023).
- Mixed-modality extensions: Visual question decomposition with sub-question generation and selective composition (Zhang et al., 2024, Khan et al., 2023).
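The two mappings above can be sketched as a minimal interface. The rule-based decomposer below (splitting on conjunctions) and the trivial aggregator are illustrative stand-ins, not any cited system:

```python
from typing import Callable, List

def decompose(question: str) -> List[str]:
    """D: split a conjunctive question on ' and ' into sub-questions.
    A toy stand-in for a learned or rule-based decomposer."""
    parts = [p.strip() for p in question.rstrip("?").split(" and ")]
    return [p + "?" for p in parts]

def compose(sub_answers: List[str]) -> str:
    """C: a trivial aggregator that joins the sub-answers."""
    return "; ".join(sub_answers)

def answer_pipeline(question: str, answer_fn: Callable[[str], str]) -> str:
    """Decompose, answer each sub-question, then recombine."""
    sub_questions = decompose(question)
    return compose([answer_fn(q) for q in sub_questions])
```

Real systems replace both stand-ins with learned components, but the decompose-answer-compose contract stays the same.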
2. Algorithmic and Model Strategies
Contemporary decomposition methodologies can be categorized broadly as follows:
A. Structural/Linguistic Decomposition
Deploying NLP techniques (POS, dependency, SRL, coreference) to map questions into fields or mini-documents that are then indexed, scored, and recomposed using statistical or search-based frameworks (Jurczyk et al., 2016). Each field corresponds to a bag-of-terms (lexical, syntactic, semantic), and retrieval incorporates per-field TF-IDF scoring, weighted by learnable coefficients.
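The per-field scoring can be sketched as follows; the field names, IDF table, and λ weights are toy assumptions standing in for the learned coefficients of Jurczyk et al. (2016):

```python
from collections import Counter
from typing import Dict, List

def tfidf_score(q_terms: List[str], d_terms: List[str],
                idf: Dict[str, float]) -> float:
    """Sum of term frequency * IDF over question terms found in the document field."""
    tf = Counter(d_terms)
    return sum(tf[t] * idf.get(t, 0.0) for t in set(q_terms))

def field_score(q_fields: Dict[str, List[str]],
                d_fields: Dict[str, List[str]],
                idf: Dict[str, float],
                lambdas: Dict[str, float]) -> float:
    """Weighted sum of per-field TF-IDF scores; lambdas holds one
    (learnable) weight per field name."""
    return sum(lambdas[f] * tfidf_score(q_fields[f], d_fields[f], idf)
               for f in q_fields)
```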
B. Learned Neural Decomposition
Trainable neural modules (pointer networks, seq2seq LMs, BART/T5/Flan-T5-based decomposers) generate either explicit sub-questions (as text) or intermediate operator chains. Learning may be fully supervised (Guo et al., 2022, Min et al., 2019), weakly supervised via hard-EM or unsupervised round-trip objectives (Perez et al., 2020), or involve distant/pretrain signal mining from comparable texts (Zhou et al., 2022).
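The hard-EM flavor of weak supervision reduces to treating the decomposition as a latent variable and keeping the argmax candidate; a hedged sketch (cf. Min et al., 2019; Perez et al., 2020), where `score_fn` is a hypothetical stand-in for the downstream QA model's likelihood of the gold answer:

```python
from typing import Callable, List, Sequence

def hard_em_select(candidates: Sequence[List[str]],
                   score_fn: Callable[[List[str]], float]) -> List[str]:
    """With no gold decompositions, keep the candidate decomposition
    whose sub-questions best support the gold answer downstream."""
    return max(candidates, key=score_fn)
```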
C. Graph/AMR-Based Decomposition
Mapping questions to semantic (AMR) or logical (SPARQL, SQL, QPL) graphs, which are then partitioned via graph segmentation or operator mapping to produce sub-questions tied to entities or compositional structures. Sub-question generation then uses graph-to-text models (Deng et al., 2022, Eyal et al., 2023).
D. Hierarchical and Tree-Based Methods
Modeling decompositions as trees (e.g., QDT, HQDT) that organize sub-questions via composition, conjunction, and constraints. Two-stage neural architectures (“Clue-Decipher,” BART-based decomposers) first hypothesize separator placements and then align to the question, ensuring coverage of all original tokens and compositional types (Huang et al., 2023, Zhang et al., 2023).
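A minimal tree structure in the spirit of QDT: internal nodes mark how children combine (composition vs. conjunction), leaves hold atomic sub-questions. The node labels and the `#1` back-reference convention are illustrative assumptions, not the exact schema of Huang et al. (2023):

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class QDTNode:
    label: str                      # "leaf", "composition", or "conjunction"
    text: str = ""                  # sub-question text (leaves only)
    children: List["QDTNode"] = field(default_factory=list)

    def leaves(self) -> List[str]:
        """Atomic sub-questions in left-to-right (evaluation) order."""
        if self.label == "leaf":
            return [self.text]
        out: List[str] = []
        for child in self.children:
            out.extend(child.leaves())
        return out

tree = QDTNode("conjunction", children=[
    QDTNode("leaf", "Which river flows through Cairo?"),
    QDTNode("composition", children=[
        QDTNode("leaf", "Who directed Inception?"),
        QDTNode("leaf", "When was #1 born?"),  # #1 = answer to prior leaf
    ]),
])
```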
E. Prompt-based, In-Context and Retrieval-Augmented Decomposition
Zero- or few-shot in-context learning to induce decomposition ability from existing datasets or tasks (e.g., ICAT via Fréchet Term-Embedding Distance for exemplar selection (V et al., 2023)). Retrieval-augmented pipelines intertwine LLM-driven decomposition with document or passage retrieval, followed by answer composition after cross-encoder reranking (Ammann et al., 1 Jul 2025).
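The retrieval-augmented variant can be sketched as a per-sub-question retrieval loop with pooled, reranked evidence; `decompose`, `retrieve`, and `rerank` here are hypothetical interfaces, not the concrete components of the cited pipeline:

```python
from typing import Callable, List

def rag_with_decomposition(question: str,
                           decompose: Callable[[str], List[str]],
                           retrieve: Callable[[str], List[str]],
                           rerank: Callable[[str, List[str]], List[str]],
                           top_k: int = 3) -> List[str]:
    """Retrieve passages per sub-question, pool and dedupe them,
    then rerank against the original question."""
    pooled: List[str] = []
    for sub_q in decompose(question):
        for passage in retrieve(sub_q):
            if passage not in pooled:
                pooled.append(passage)
    return rerank(question, pooled)[:top_k]
```

The returned top-k passages would then feed the answer-composition stage.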
F. Reward-Shaped and Selective Decomposition
Fine-tuning LLMs to pose only those sub-questions that provably improve downstream QA performance, with reward functions based on answer flipping and preservation (Wang et al., 1 Oct 2025). Selective decomposition mechanisms decide, per question, whether decomposition is needed, achieving substantial improvements in multimodal and medical VQA (Zhang et al., 2024, Khan et al., 2023).
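A flip/preserve reward in this spirit can be sketched as below; the exact reward values are assumptions for illustration, not those of Wang et al. (2025):

```python
def decomposition_reward(gold: str,
                         pred_without: str,
                         pred_with: str) -> float:
    """Reward a sub-question by whether conditioning on its answer
    flips the downstream prediction toward or away from the gold answer."""
    correct_before = pred_without == gold
    correct_after = pred_with == gold
    if not correct_before and correct_after:
        return 1.0    # sub-question fixed the answer: keep it
    if correct_before and not correct_after:
        return -1.0   # sub-question broke the answer: penalize
    return 0.0        # no change: the decomposition was unnecessary
```

A selective-decomposition policy follows naturally: only pose sub-questions whose expected reward is positive.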
3. Applications Across QA Modalities
The decomposition paradigm is highly modular, enabling its application across textual, tabular, visual, and knowledge-base QA.
- Open-Domain and Multi-Hop Textual QA: Decomposition recasts multi-hop queries into factored chains, generally producing large gains in supporting-fact retrieval, multi-hop reasoning, and end-to-end F1 under low- and medium-data regimes (Min et al., 2019, Wei et al., 2022, Guo et al., 2022, Perez et al., 2020).
- Knowledge Base QA (KBQA): Tree-based and QPL decomposition strategies map complex compositions into sequences of SPARQL or SQL program steps, often yielding performance boosts and improved interpretability (Huang et al., 2023, Eyal et al., 2023).
- Quantitative Reasoning / Financial QA: Reward-pruned, domain-specific decomposition can focus LLM attention on a single crucial supporting sub-question (on average 1.2 per question) and surpass long CoT chains in both accuracy and inference efficiency (Wang et al., 1 Oct 2025).
- Multimodal / Visual QA: Visual QA models benefit from decomposing high-level, semantic questions into a blend of low-level perceptual sub-questions and conceptual reasoning steps; selective decomposition policies further increase SOTA accuracy, especially in domain-specialized and low-data regimes (Zhang et al., 2024, Khan et al., 2023).
4. Effectiveness, Empirical Results, and Limitations
Question decomposition can dramatically improve several key metrics, often dependent on the regime:
| Setting/Strategy | Typical Gain | Notes |
|---|---|---|
| Structural fields (bAbI, all λ) | > +40 pp MAP | Multi-field, learned weights (Jurczyk et al., 2016) |
| Hard compositional RC (DROP, DecompRC) | +6–7 F1 | Supervised pointer-based (Min et al., 2019) |
| Weakly/unsupervised multi-hop (ONUS, QDAMR) | +2–8 F1 | No gold decompositions (Perez et al., 2020, Deng et al., 2022) |
| RAG multi-hop retrieval (QD+RR) | +11.6% F1 / +36% MRR | Zero-shot, passage-level (Ammann et al., 1 Jul 2025) |
| Financial QA (EQD) | +0.6–10.5% EM | Only 1–1.2 sub-questions generated (Wang et al., 1 Oct 2025) |
| Zero-shot VQA, selective-decomp. | +6–26 pp acc. | Largest gains in medical/art tasks (Khan et al., 2023) |
Major considerations and limitations identified:
- Decomposition improves performance most in low-data or high-compositionality regimes. As labeled data increases, end-to-end seq2seq models can implicitly learn decompositions through attention and hidden representations (Wei et al., 2022).
- Over-decomposition (producing many sub-questions when a single supporting fact suffices) may harm performance due to error propagation or retrieval overhead (Wang et al., 1 Oct 2025, Ammann et al., 1 Jul 2025).
- Faithfulness improves substantially: factored decomposition can force LLMs to actually condition the answer on sub-answers, reducing “post-hoc rationalization” and hidden sycophancy compared to Chain-of-Thought (Radhakrishnan et al., 2023).
- Quality of sub-question generation is critical; annotation or data refinement on repetition, groundedness, and relevance is central in multimodal settings (Zhang et al., 2024).
- Error propagation and ill-posed sub-questions remain a concern in explicit or unsupervised chains (Wei et al., 2022, Patel et al., 2022).
- For knowledge-integration tasks, hierarchical structures (trees) provide more robust fusion of heterogeneous sources than flat lists (Zhang et al., 2023).
5. Interpretability, Faithfulness, and Explainability
Decomposition strategies provide intrinsic interpretability—each sub-question constitutes an explicit, human-auditable reasoning step. This property underpins both user trust and fault localization in complex QA systems.
- Factored and field-based decompositions expose intermediate structures (fields, trees, graphs) that clarify decision provenance (Jurczyk et al., 2016, Huang et al., 2023, Zhang et al., 2023).
- Explicit answer composition steps (e.g., in RoHT/HQDT) allow probabilistic merging of KB-supported, text-supported, and recursively derived answers, with global likelihoods available for each branch (Zhang et al., 2023).
- Faithfulness metrics—including truncation sensitivity, corruption sensitivity, and bias-induced accuracy drop—objectively quantify how tightly final predictions depend on generated reasoning, highlighting the superiority of decomposition over CoT on these axes (Radhakrishnan et al., 2023).
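One of these probes, truncation sensitivity, can be sketched as follows: if the final answer rarely changes when the reasoning chain is truncated, the answer was not really conditioned on the reasoning. `answer_fn` is a hypothetical model interface taking a question and a (possibly truncated) list of reasoning steps:

```python
from typing import Callable, Sequence

def truncation_sensitivity(question: str,
                           steps: Sequence[str],
                           answer_fn: Callable[[str, Sequence[str]], str]) -> float:
    """Fraction of truncation points at which the final answer changes.
    Higher values suggest the answer genuinely depends on the reasoning."""
    full_answer = answer_fn(question, steps)
    changed = sum(answer_fn(question, steps[:k]) != full_answer
                  for k in range(len(steps)))
    return changed / len(steps) if steps else 0.0
```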
6. Practical Guidance and Future Directions
- When Decomposition Helps: Most beneficial in few-shot, zero-shot, or cross-domain adaptation settings; for difficult, compositional, or multi-hop questions; and for retrieval-augmented or knowledge-integration tasks (Wei et al., 2022, Ammann et al., 1 Jul 2025).
- Modularity and Transfer: Decomposers can be trained or selected via in-context transfer (e.g., ICAT), compositional data augmentation, or reward shaping, with plug-and-play architectures in most modern LLM or RAG pipelines (V et al., 2023, Wang et al., 1 Oct 2025, Ammann et al., 1 Jul 2025).
- Extension to New Modalities: Finetuned multimodal models (VQD) and multimodal knowledge integration (HQDT, selective decomposition) represent state-of-the-art modular designs for explainable VQA and XQA (Zhang et al., 2024, Zhang et al., 2023).
- Limitations and Cautions: Careful calibration is required to avoid over-decomposition and error cascades; joint training and dynamic decomposition length selection are open areas for research. Sub-question generation quality, particularly in highly compositional or domain-specific settings, is pivotal.
A plausible implication is that as LLM reasoning abilities improve and implicit decomposition becomes more salient in deep models, the value of explicit decomposition may further shift toward interpretability, reliability, and modularity, rather than pure accuracy, especially in mission-critical or high-risk domains.
References:
- (Jurczyk et al., 2016) Multi-Field Structural Decomposition for Question Answering
- (Wang et al., 1 Oct 2025) One More Question is Enough: Expert Question Decomposition (EQD) Model for Domain Quantitative Reasoning
- (Wei et al., 2022) When Do Decompositions Help for Machine Reading?
- (Perez et al., 2020) Unsupervised Question Decomposition for Question Answering
- (Guo et al., 2022) Complex Reading Comprehension Through Question Decomposition
- (Huang et al., 2023) Question Decomposition Tree for Answering Complex Questions over Knowledge Bases
- (Zhang et al., 2023) Reasoning over Hierarchical Question Decomposition Tree for Explainable Question Answering
- (Ammann et al., 1 Jul 2025) Question Decomposition for Retrieval-Augmented Generation
- (Eyal et al., 2023) Semantic Decomposition of Question and SQL for Text-to-SQL Parsing
- (Zhang et al., 2024) Visual Question Decomposition on Multimodal LLMs
- (Khan et al., 2023) Exploring Question Decomposition for Zero-Shot VQA
- (Zhou et al., 2022) Learning to Decompose: Hypothetical Question Decomposition Based on Comparable Texts
- (V et al., 2023) In-Context Ability Transfer for Question Decomposition in Complex QA
- (Min et al., 2019) Multi-hop Reading Comprehension through Question Decomposition and Rescoring
- (Radhakrishnan et al., 2023) Question Decomposition Improves the Faithfulness of Model-Generated Reasoning
- (Patel et al., 2022) Is a Question Decomposition Unit All We Need?
- (Deng et al., 2022) Interpretable AMR-Based Question Decomposition for Multi-hop Question Answering
- (Cao et al., 2021) Coarse-grained decomposition and fine-grained interaction for multi-hop question answering