Hierarchical Fusion in HF-RAG
- Hierarchical Fusion in HF-RAG is an approach that integrates evidence from heterogeneous data sources like code, tables, and graphs, enabling multi-level contextual integration.
- The framework employs both intra-source fusion (using techniques such as CombSUM, CombMNZ, and RRF) and inter-source score normalization to merge diverse retrieval outputs effectively.
- Empirical studies show that hierarchical fusion leads to significant improvements in metrics such as EM, F1, and Plan Correctness across domains including code completion, QA, and robotic planning.
Hierarchical Fusion in HF-RAG refers to the integration of evidence and context from heterogeneous sources and structures in Retrieval-Augmented Generation frameworks, leveraging both architectural and score-normalization mechanisms. The approach is motivated by limitations of naive context concatenation and unimodal retrieval, which fail to exploit cross-level dependencies, compatibility among sources, and structural semantics across code, tabular, or graph-based domains. HF-RAG implementations span multi-source document retrieval (Santra et al., 2 Sep 2025), multi-level code and graph fusion (Wang et al., 7 Sep 2025), hybrid tabular-text QA (Zhang et al., 13 Apr 2025), and neuro-symbolic planning (Cornelio et al., 6 Apr 2025), each exemplifying domain-specific hierarchies and fusion operations.
1. Hierarchical Data Structures: Code, Tables, Graphs
HF-RAG methods assume that context is hierarchically organized by type, scope, or semantic level. In code repositories, GRACE (Wang et al., 7 Sep 2025) models the repository as a three-tier heterogeneous graph comprising:
- Repository-level: Folder structure, cross-file dependency.
- Module-level: Function call graphs, type dependencies, class inheritance.
- Function-level: ASTs, control-flow graphs, data-flow graphs.
Nodes bear attributes such as code snippets, types, and location, and edges are semantically typed (e.g., “call,” “implement”).
For tabular documents, HD-RAG (Zhang et al., 13 Apr 2025) introduces a row-and-column-level (RCL) schema for hierarchical tables:
- Cells are indexed via multi-level row and column header paths.
- Summaries are generated in two forms: “General RCL” (single-layer flattening) and “H-RCL” (multi-level contextualization by path).
In planning tasks, knowledge graphs serve as hierarchical world-state representations (Cornelio et al., 6 Apr 2025), decomposed into subgraphs relevant for macro and atomic actions.
2. Hierarchical Fusion Mechanisms and Mathematical Frameworks
Fusion in HF-RAG occurs both within sources (intra-source) and between sources (inter-source):
Intra-Source Fusion:
HF-RAG (Santra et al., 2 Sep 2025) aggregates ranked outputs from multiple IR models using schemes such as CombSUM, CombMNZ, and Reciprocal Rank Fusion (RRF):
- CombSUM: $s(d) = \sum_{i} s_i(d)$, summing document $d$'s scores across all retrieval models.
- CombMNZ: $s(d) = m(d) \cdot \sum_{i} s_i(d)$, with $m(d)$ counting the retrieval models that assign $d$ a non-zero score.
- RRF: $s(d) = \sum_{i} \frac{1}{k + r_i(d)}$, where $r_i(d)$ is $d$'s rank in list $i$ and $k$ is a smoothing constant (commonly 60).
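The three intra-source schemes above can be sketched as follows. This is a minimal illustration of the standard CombSUM/CombMNZ/RRF definitions, not code from the HF-RAG paper; the dictionary-based interfaces are an assumption for clarity.

```python
from collections import defaultdict

def comb_sum(score_lists):
    """CombSUM: sum each document's scores across all retrieval models."""
    fused = defaultdict(float)
    for scores in score_lists:            # scores: {doc_id: score}
        for doc, s in scores.items():
            fused[doc] += s
    return dict(fused)

def comb_mnz(score_lists):
    """CombMNZ: CombSUM scaled by how many models scored the document non-zero."""
    sums = comb_sum(score_lists)
    nonzero = defaultdict(int)
    for scores in score_lists:
        for doc, s in scores.items():
            if s != 0:
                nonzero[doc] += 1
    return {doc: total * nonzero[doc] for doc, total in sums.items()}

def rrf(ranked_lists, k=60):
    """Reciprocal Rank Fusion: sum 1/(k + rank) over lists (k=60 is conventional)."""
    fused = defaultdict(float)
    for ranked in ranked_lists:           # ranked: [doc_id, ...], best first
        for rank, doc in enumerate(ranked, start=1):
            fused[doc] += 1.0 / (k + rank)
    return dict(fused)
```

Note that CombSUM and CombMNZ operate on raw scores, while RRF depends only on ranks, which makes it robust to score-scale mismatch between models.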
Inter-Source Fusion:
Scores are standardized per source via z-score normalization: $z_j(d) = \frac{s_j(d) - \mu_j}{\sigma_j}$, where $\mu_j$ and $\sigma_j$ are the mean and standard deviation of source $j$'s scores.
Final selection merges the top-$k$ documents over both sources by highest z-score.
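A minimal sketch of this inter-source step, assuming each source yields a score dictionary (the function names and interfaces are illustrative, not from the paper):

```python
import statistics

def z_normalize(scores):
    """Standardize one source's scores to zero mean and unit variance."""
    vals = list(scores.values())
    mu, sigma = statistics.mean(vals), statistics.pstdev(vals)
    if sigma == 0:                        # degenerate source: all scores equal
        return {doc: 0.0 for doc in scores}
    return {doc: (s - mu) / sigma for doc, s in scores.items()}

def merge_top_k(per_source_scores, k):
    """Pool z-normalized scores from every source and keep the k highest documents."""
    pooled = []
    for source, scores in per_source_scores.items():
        for doc, z in z_normalize(scores).items():
            pooled.append((z, doc, source))
    pooled.sort(reverse=True)
    return pooled[:k]
```

Normalizing per source before pooling is what makes scores from, say, a dense retriever over tables and BM25 over text comparable on a shared scale.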
Graph Fusion (Code and KG):
GRACE (Wang et al., 7 Sep 2025) performs node-feature and graph-structural fusion:
- Node-feature fusion: candidate node features are weighted by reranking scores and combined via attention.
- Structure fusion: cross-level edges are added to form a unified fused graph.
Knowledge graph fusion in hierarchical planning (Cornelio et al., 6 Apr 2025) merges retrieved context with macro/atomic plan prompts, followed by symbolic validation. Hidden states are either concatenated or element-wise added.
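One plausible reading of these fusion operators is sketched below; the exact GRACE formulation is not reproduced here, so treat the softmax-attention weighting and the combine modes as assumptions for illustration.

```python
import numpy as np

def fuse_node_features(node_feats, rerank_scores):
    """Aggregate candidate node features (rows of node_feats) with attention
    weights derived from reranking scores via softmax."""
    w = np.exp(rerank_scores - np.max(rerank_scores))   # numerically stable softmax
    w /= w.sum()
    return node_feats.T @ w                             # weighted sum over nodes

def fuse_hidden_states(h_query, h_context, mode="concat"):
    """Combine query and retrieved-context hidden states by
    concatenation or element-wise addition."""
    if mode == "concat":
        return np.concatenate([h_query, h_context])
    return h_query + h_context
```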
3. Algorithms and Workflow
The fusion process is generally staged as follows:
| Step | Code Fusion (GRACE) (Wang et al., 7 Sep 2025) | Multi-Source Fusion (HF-RAG) (Santra et al., 2 Sep 2025) |
|---|---|---|
| 1 | Parse and encode local graph | Run IR models on each source |
| 2 | Retrieve and rerank subgraphs | Intra-source rank fusion (RRF) |
| 3 | GNN encode query + candidates | Z-score normalize per source |
| 4 | Compute node-feature attention | Merge top-k by z-score across sources |
| 5 | Add cross-edges, merge duplicates | Concatenate for LLM generation |
| 6 | Serialize graph to prompt | (n/a) |
For HD-RAG (Zhang et al., 13 Apr 2025), hierarchical table paths generate RCL summaries, which are treated as “text passages” throughout retrieval and generation—no explicit fusion network is used.
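The path-based linearization can be sketched as below; this is an illustrative H-RCL-style flattening, not HD-RAG's exact summary format, and the tuple-keyed table representation is an assumption.

```python
def rcl_summaries(table):
    """Linearize a hierarchical table into one 'row-path | column-path = value'
    text line per cell, so each cell can be retrieved as a passage."""
    lines = []
    for row_path, columns in table.items():      # row_path: tuple of header levels
        for col_path, value in columns.items():  # col_path: tuple of header levels
            lines.append(f"{' > '.join(row_path)} | {' > '.join(col_path)} = {value}")
    return lines
```

Because each summary line is plain text, an off-the-shelf passage retriever can index it with no table-specific machinery.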
Hierarchical planning (Cornelio et al., 6 Apr 2025) recursively retrieves and fuses KG context into LLM prompts at each plan level, interleaved with symbolic validation.
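The recursive retrieve-fuse-validate loop can be outlined as follows. This is an illustrative skeleton under assumed interfaces: `retrieve`, `generate`, and `validate` are hypothetical callables standing in for KG retrieval, LLM prompting, and symbolic validation, not an API from the paper.

```python
def hierarchical_plan(goal, retrieve, generate, validate, depth=0, max_depth=1):
    """Recursively fuse retrieved KG context into the prompt at each plan
    level, keeping only symbolically validated steps, then expand each
    validated step at the next level down."""
    context = retrieve(goal, depth)                  # KG subgraph for this level
    steps = [s for s in generate(goal, context) if validate(s)]
    if depth == max_depth:
        return steps                                 # atomic actions: stop here
    return [hierarchical_plan(s, retrieve, generate, validate, depth + 1, max_depth)
            for s in steps]
```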
4. Graph and Data Serialization for LLMs
After hierarchical fusion, context is serialized for LLM input:
- GRACE produces natural-language triples describing node attributes and typed relations, e.g., `FunctionNode(id123): name=foo(args); calls → FunctionNode(id456); inherits → ClassNode(id789);`
- HD-RAG converts hierarchical table paths into readable row/column summaries for embedding and prompt assembly.
- KG-RAG fuses entity relations as prompt context at each decomposition level.
Passage-level embedding models ensure that fused outputs are in compatible latent spaces for downstream generation.
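A serializer in the `FunctionNode(...)` style used by GRACE might look like the following; this is a hypothetical helper mimicking that notation, not code from the paper.

```python
def serialize_node(node_type, node_id, attrs, edges):
    """Render one graph node, its attributes, and its typed outgoing edges
    as a single natural-language line for LLM prompt assembly."""
    attr_str = ", ".join(f"{k}={v}" for k, v in attrs.items())
    parts = [f"{node_type}({node_id}): {attr_str}"]
    parts += [f"{rel} → {target}" for rel, target in edges]
    return "; ".join(parts) + ";"
```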
5. Empirical Gains and Ablations
Experimental studies demonstrate that hierarchical fusion delivers substantial improvements:
- GRACE (Wang et al., 7 Sep 2025):
Removing graph-fusion drops EM by 5.4 pp and F1 by 6.2 pp on CrossCodeEval-Python, and similar deficits on Java, confirming fusion as critical to code completion accuracy.
- HF-RAG (Santra et al., 2 Sep 2025):
Hierarchical fusion (RRF intra-source + z-score inter-source) yields macro-F1 up to 0.5744, outperforming best baselines by 0.0864 and showing marked out-of-domain generalization on SciFact (+0.05 F1).
- HD-RAG (Zhang et al., 13 Apr 2025):
Outperforms separate table/text retrieval approaches in both retrieval accuracy and complex QA tasks, due to implicit fusion via hierarchical table-path summaries.
- KG-RAG Planning (Cornelio et al., 6 Apr 2025):
HVR (full hierarchy + KG-RAG + symbolic verification) achieves Plan Correctness up to 94.19%, with ablations revealing that removal of any component leads to pronounced declines (e.g., HR 54.32%).
6. Domain-Specific Considerations and Extensions
Implementations of HF-RAG adapt the hierarchical fusion principle to domain requirements:
- Graph-based code showcases fusion of multi-level semantics and structural dependencies, requiring type compatibility enforcement and AST/CFG/DFG preservation.
- Tabular-text hybrid QA encodes hierarchies textually to leverage passage-based retrievers without custom neural fusion operators.
- Robotic planning employs subgraph retrieval and symbolic validators for hierarchical decomposition and correctness, promoting reliability in long-horizon tasks.
A plausible implication is that hierarchical fusion frameworks generalize across data types, provided that semantic compatibility and score normalization are rigorously handled. This suggests further research on optimal fusion operators, latent alignment, and hybrid prompt construction for multi-modal, multi-source RAG systems.
7. Challenges and Future Directions
Key open challenges include:
- Score scale mismatch across heterogeneous sources and retrievers.
- Structural compatibility and information preservation during cross-hierarchy fusion.
- Computational and memory efficiency in multi-level graph and passage fusion.
- Fine-grained ablation of fusion strategy benefits across domains (code, text, tabular, graph/planning).
Future research may address fusion at deeper architectural layers (cross-modal transformers, hybrid latent fusion), develop new retrieval normalization schemes beyond z-score, and formalize hierarchy-aware context selection for greater generalization and robustness in RAG-based systems. Recent works highlight the importance of multi-level fusion for both in-domain accuracy and out-of-domain generalization (Santra et al., 2 Sep 2025), reinforcing its centrality in modern retrieval-augmented generation frameworks.