Hierarchical Fusion in HF-RAG
- Hierarchical Fusion in HF-RAG is an approach that integrates evidence from heterogeneous data sources like code, tables, and graphs, enabling multi-level contextual integration.
- The framework employs both intra-source fusion (using techniques such as CombSUM, CombMNZ, and RRF) and inter-source score normalization to merge diverse retrieval outputs effectively.
- Empirical studies show that hierarchical fusion leads to significant improvements in metrics such as EM, F1, and Plan Correctness across domains including code completion, QA, and robotic planning.
Hierarchical Fusion in HF-RAG refers to the integration of evidence and context from heterogeneous sources and structures in Retrieval-Augmented Generation frameworks, leveraging both architectural and score-normalization mechanisms. The approach is motivated by limitations of naive context concatenation and unimodal retrieval, which fail to exploit cross-level dependencies, compatibility among sources, and structural semantics across code, tabular, or graph-based domains. HF-RAG implementations span multi-source document retrieval (Santra et al., 2 Sep 2025), multi-level code and graph fusion (Wang et al., 7 Sep 2025), hybrid tabular-text QA (Zhang et al., 13 Apr 2025), and neuro-symbolic planning (Cornelio et al., 6 Apr 2025), each exemplifying domain-specific hierarchies and fusion operations.
1. Hierarchical Data Structures: Code, Tables, Graphs
HF-RAG methods assume that context is hierarchically organized by type, scope, or semantic level. In code repositories, GRACE (Wang et al., 7 Sep 2025) models the repository as a three-tier heterogeneous graph comprising:
- Repository-level: Folder structure, cross-file dependency.
- Module-level: Function call graphs, type dependencies, class inheritance.
- Function-level: ASTs, control-flow graphs, data-flow graphs.
Nodes bear attributes such as code snippets, types, and location, and edges are semantically typed (e.g., “call,” “implement”).
For tabular documents, HD-RAG (Zhang et al., 13 Apr 2025) introduces a row-and-column-level (RCL) schema for hierarchical tables:
- Cells are indexed via multi-level row and column header paths.
- Summaries are generated in two forms: “General RCL” (single-layer flattening) and “H-RCL” (multi-level contextualization by path).
In planning tasks, knowledge graphs serve as hierarchical world-state representations (Cornelio et al., 6 Apr 2025), decomposed into subgraphs relevant for macro and atomic actions.
2. Hierarchical Fusion Mechanisms and Mathematical Frameworks
Fusion in HF-RAG occurs both within sources (intra-source) and between sources (inter-source):
Intra-Source Fusion:
HF-RAG (Santra et al., 2 Sep 2025) aggregates ranked outputs from multiple IR models using schemes such as CombSUM, CombMNZ, and Reciprocal Rank Fusion (RRF):
- CombSUM: $s(d) = \sum_{i} s_i(d)$, summing document $d$'s scores across all retrieval models.
- CombMNZ: $s(d) = m(d) \cdot \sum_{i} s_i(d)$, with $m(d)$ counting the retrieval models that assign $d$ a non-zero score.
- RRF: $s(d) = \sum_{i} \frac{1}{k + r_i(d)}$, where $r_i(d)$ is $d$'s rank in list $i$ and $k$ is a smoothing constant (commonly 60).
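The three intra-source schemes above can be sketched as follows. This is a minimal illustration of the standard CombSUM/CombMNZ/RRF definitions, not code from the HF-RAG paper; the dictionary-based interfaces are an assumption for clarity.

```python
from collections import defaultdict

def comb_sum(score_lists):
    """CombSUM: sum each document's scores across all retrieval models."""
    fused = defaultdict(float)
    for scores in score_lists:            # scores: {doc_id: score}
        for doc, s in scores.items():
            fused[doc] += s
    return dict(fused)

def comb_mnz(score_lists):
    """CombMNZ: CombSUM scaled by how many models scored the document non-zero."""
    sums = comb_sum(score_lists)
    nonzero = defaultdict(int)
    for scores in score_lists:
        for doc, s in scores.items():
            if s != 0:
                nonzero[doc] += 1
    return {doc: total * nonzero[doc] for doc, total in sums.items()}

def rrf(ranked_lists, k=60):
    """Reciprocal Rank Fusion: sum 1/(k + rank) over lists (k=60 is conventional)."""
    fused = defaultdict(float)
    for ranked in ranked_lists:           # ranked: [doc_id, ...], best first
        for rank, doc in enumerate(ranked, start=1):
            fused[doc] += 1.0 / (k + rank)
    return dict(fused)
```

Note that CombSUM and CombMNZ operate on raw scores, while RRF depends only on ranks, which makes it robust to score-scale mismatch between models.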
Inter-Source Fusion:
Scores are standardized per source via z-score normalization: $z_j(d) = \frac{s_j(d) - \mu_j}{\sigma_j}$, where $\mu_j$ and $\sigma_j$ are the mean and standard deviation of source $j$'s scores.
Final selection merges the top-$k$ documents over both sources by highest z-score.
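A minimal sketch of this inter-source step, assuming each source yields a score dictionary (the function names and interfaces are illustrative, not from the paper):

```python
import statistics

def z_normalize(scores):
    """Standardize one source's scores to zero mean and unit variance."""
    vals = list(scores.values())
    mu, sigma = statistics.mean(vals), statistics.pstdev(vals)
    if sigma == 0:                        # degenerate source: all scores equal
        return {doc: 0.0 for doc in scores}
    return {doc: (s - mu) / sigma for doc, s in scores.items()}

def merge_top_k(per_source_scores, k):
    """Pool z-normalized scores from every source and keep the k highest documents."""
    pooled = []
    for source, scores in per_source_scores.items():
        for doc, z in z_normalize(scores).items():
            pooled.append((z, doc, source))
    pooled.sort(reverse=True)
    return pooled[:k]
```

Normalizing per source before pooling is what makes scores from, say, a dense retriever over tables and BM25 over text comparable on a shared scale.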
Graph Fusion (Code and KG):
GRACE (Wang et al., 7 Sep 2025) performs node-feature and graph-structural fusion:
- Node-feature fusion: candidate node features are weighted by reranking scores and combined via attention.
- Structure fusion: cross-level edges are added to form a unified fused graph.
Knowledge graph fusion in hierarchical planning (Cornelio et al., 6 Apr 2025) merges retrieved context with macro/atomic plan prompts, followed by symbolic validation. Hidden states are either concatenated or element-wise added.
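One plausible reading of these fusion operators is sketched below; the exact GRACE formulation is not reproduced here, so treat the softmax-attention weighting and the combine modes as assumptions for illustration.

```python
import numpy as np

def fuse_node_features(node_feats, rerank_scores):
    """Aggregate candidate node features (rows of node_feats) with attention
    weights derived from reranking scores via softmax."""
    w = np.exp(rerank_scores - np.max(rerank_scores))   # numerically stable softmax
    w /= w.sum()
    return node_feats.T @ w                             # weighted sum over nodes

def fuse_hidden_states(h_query, h_context, mode="concat"):
    """Combine query and retrieved-context hidden states by
    concatenation or element-wise addition."""
    if mode == "concat":
        return np.concatenate([h_query, h_context])
    return h_query + h_context
```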
3. Algorithms and Workflow
The fusion process is generally staged as follows:
| Step | Code Fusion (GRACE) (Wang et al., 7 Sep 2025) | Multi-Source Fusion (HF-RAG) (Santra et al., 2 Sep 2025) |
|---|---|---|
| 1 | Parse and encode local graph | Run IR models on each source |
| 2 | Retrieve and rerank subgraphs | Intra-source rank fusion (RRF) |
| 3 | GNN encode query + candidates | Z-score normalize per source |
| 4 | Compute node-feature attention | Merge top-k by z-score across sources |
| 5 | Add cross-edges, merge duplicates | Concatenate for LLM generation |
| 6 | Serialize graph to prompt | (n/a) |
For HD-RAG (Zhang et al., 13 Apr 2025), hierarchical table paths generate RCL summaries, which are treated as “text passages” throughout retrieval and generation—no explicit fusion network is used.
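The path-based linearization can be sketched as below; this is an illustrative H-RCL-style flattening, not HD-RAG's exact summary format, and the tuple-keyed table representation is an assumption.

```python
def rcl_summaries(table):
    """Linearize a hierarchical table into one 'row-path | column-path = value'
    text line per cell, so each cell can be retrieved as a passage."""
    lines = []
    for row_path, columns in table.items():      # row_path: tuple of header levels
        for col_path, value in columns.items():  # col_path: tuple of header levels
            lines.append(f"{' > '.join(row_path)} | {' > '.join(col_path)} = {value}")
    return lines
```

Because each summary line is plain text, an off-the-shelf passage retriever can index it with no table-specific machinery.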
Hierarchical planning (Cornelio et al., 6 Apr 2025) recursively retrieves and fuses KG context into LLM prompts at each plan level, interleaved with symbolic validation.
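The recursive retrieve-fuse-validate loop can be outlined as follows. This is an illustrative skeleton under assumed interfaces: `retrieve`, `generate`, and `validate` are hypothetical callables standing in for KG retrieval, LLM prompting, and symbolic validation, not an API from the paper.

```python
def hierarchical_plan(goal, retrieve, generate, validate, depth=0, max_depth=1):
    """Recursively fuse retrieved KG context into the prompt at each plan
    level, keeping only symbolically validated steps, then expand each
    validated step at the next level down."""
    context = retrieve(goal, depth)                  # KG subgraph for this level
    steps = [s for s in generate(goal, context) if validate(s)]
    if depth == max_depth:
        return steps                                 # atomic actions: stop here
    return [hierarchical_plan(s, retrieve, generate, validate, depth + 1, max_depth)
            for s in steps]
```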
4. Graph and Data Serialization for LLMs
After hierarchical fusion, context is serialized for LLM input:
- GRACE produces natural-language triples describing node attributes and typed relations, e.g., `FunctionNode(id123): name=foo(args); calls → FunctionNode(id456); inherits → ClassNode(id789);`
- HD-RAG converts hierarchical table paths into readable row/column summaries for embedding and prompt assembly.
- KG-RAG fuses entity relations as prompt context at each decomposition level.
Passage-level embedding models ensure that fused outputs are in compatible latent spaces for downstream generation.
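A serializer in the `FunctionNode(...)` style used by GRACE might look like the following; this is a hypothetical helper mimicking that notation, not code from the paper.

```python
def serialize_node(node_type, node_id, attrs, edges):
    """Render one graph node, its attributes, and its typed outgoing edges
    as a single natural-language line for LLM prompt assembly."""
    attr_str = ", ".join(f"{k}={v}" for k, v in attrs.items())
    parts = [f"{node_type}({node_id}): {attr_str}"]
    parts += [f"{rel} → {target}" for rel, target in edges]
    return "; ".join(parts) + ";"
```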
5. Empirical Gains and Ablations
Experimental studies demonstrate that hierarchical fusion delivers substantial improvements:
- GRACE (Wang et al., 7 Sep 2025):
Removing graph-fusion drops EM by 5.4 pp and F1 by 6.2 pp on CrossCodeEval-Python, and similar deficits on Java, confirming fusion as critical to code completion accuracy.
- HF-RAG (Santra et al., 2 Sep 2025):
Hierarchical fusion (RRF intra-source + z-score inter-source) yields macro-F1 up to 0.5744, outperforming best baselines by 0.0864 and showing marked out-of-domain generalization on SciFact (+0.05 F1).
- HD-RAG (Zhang et al., 13 Apr 2025):
Outperforms separate table/text retrieval approaches in both retrieval accuracy and complex QA tasks, due to implicit fusion via hierarchical table-path summaries.
- KG-RAG Planning (Cornelio et al., 6 Apr 2025):
HVR (full hierarchy + KG-RAG + symbolic verification) achieves Plan Correctness up to 94.19%, with ablations revealing that removal of any component leads to pronounced declines (e.g., HR 54.32%).
6. Domain-Specific Considerations and Extensions
Implementations of HF-RAG adapt the hierarchical fusion principle to domain requirements:
- Graph-based code showcases fusion of multi-level semantics and structural dependencies, requiring type compatibility enforcement and AST/CFG/DFG preservation.
- Tabular-text hybrid QA encodes hierarchies textually to leverage passage-based retrievers without custom neural fusion operators.
- Robotic planning employs subgraph retrieval and symbolic validators for hierarchical decomposition and correctness, promoting reliability in long-horizon tasks.
A plausible implication is that hierarchical fusion frameworks generalize across data types, provided that semantic compatibility and score normalization are rigorously handled. This suggests further research on optimal fusion operators, latent alignment, and hybrid prompt construction for multi-modal, multi-source RAG systems.
7. Challenges and Future Directions
Key open challenges include:
- Score scale mismatch across heterogeneous sources and retrievers.
- Structural compatibility and information preservation during cross-hierarchy fusion.
- Computational and memory efficiency in multi-level graph and passage fusion.
- Fine-grained ablation of fusion strategy benefits across domains (code, text, tabular, graph/planning).
Future research may address fusion at deeper architectural layers (cross-modal transformers, hybrid latent fusion), develop new retrieval normalization schemes beyond z-score, and formalize hierarchy-aware context selection for greater generalization and robustness in RAG-based systems. Recent works highlight the importance of multi-level fusion for both in-domain accuracy and out-of-domain generalization (Santra et al., 2 Sep 2025), reinforcing its centrality in modern retrieval-augmented generation frameworks.