- The paper introduces a reason-and-construct paradigm that dynamically generates query-specific evidence graphs to address knowledge graph incompleteness.
- It leverages a unified semantic space and a query-aware beam search to filter out distractor facts and enhance multi-hop reasoning.
- Experimental evaluations on five ODQA benchmarks show average gains of 5.4% in EM and 5.2% in F1, demonstrating its effectiveness and resilience.
Relink: Constructing Query-Driven Evidence Graph On-the-Fly for GraphRAG
Introduction
The paper "Relink: Constructing Query-Driven Evidence Graph On-the-Fly for GraphRAG" addresses the limitations of traditional Graph-based Retrieval-Augmented Generation (GraphRAG) methods. These methods mitigate LLM hallucinations by grounding generation in structured knowledge, but they traditionally follow a build-then-reason paradigm over static, pre-constructed Knowledge Graphs (KGs). This leads to two significant issues: KG incompleteness, which breaks reasoning paths, and a low signal-to-noise ratio, caused by distractor facts that are relevant to the query but misaligned with its goal.
To address these challenges, the paper proposes Relink, a novel framework embodying a reason-and-construct paradigm. Relink dynamically constructs query-specific evidence graphs by instantiating required facts from latent relation pools derived from the original text corpus. This enables on-the-fly path repair and filtering of distractor facts, providing precise and query-aligned reasoning paths.
Figure 1: Static GraphRAG failures vs. Relink's Dynamic Construction. Pre-built knowledge graphs cause two critical failures in GraphRAG: (a) missing links breaking reasoning paths, and (b) distractor facts (query-relevant but goal-misaligned). In contrast, our reason-and-construct approach, Relink, addresses both by discarding distractor facts and dynamically instantiating missing ones from the latent relations derived from the original text corpus.
Proposed Framework
Heterogeneous Knowledge Source Construction
Relink addresses the KG incompleteness challenge by combining a high-precision factual KG with a high-recall latent relation pool. The factual KG serves as a reliable backbone of high-confidence relations, while the latent relation pool is built from entity co-occurrences in the text and filtered by pointwise mutual information (PMI) to capture additional associations. Each latent relation is encoded with a pretrained LLM into a dense representation, so that broken paths can be repaired dynamically by constructing the missing facts a query requires.
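The PMI filtering step described above can be illustrated with a minimal sketch. It assumes passage-level co-occurrence counts and a simple threshold; the function name `build_latent_pool`, the `min_pmi` parameter, and the toy entities are hypothetical and are not taken from the paper, which may use a different counting window or threshold.

```python
import math
from collections import Counter
from itertools import combinations

def build_latent_pool(passages, min_pmi=0.0):
    """Keep entity pairs whose passage-level PMI exceeds min_pmi.

    passages: list of sets of entity names, one set per passage
    (a hypothetical input format, assumed for illustration).
    """
    n = len(passages)
    entity_count = Counter()
    pair_count = Counter()
    for entities in passages:
        ents = sorted(set(entities))          # canonical order for pair keys
        entity_count.update(ents)
        pair_count.update(combinations(ents, 2))

    pool = {}
    for (a, b), c_ab in pair_count.items():
        p_ab = c_ab / n
        p_a = entity_count[a] / n
        p_b = entity_count[b] / n
        pmi = math.log(p_ab / (p_a * p_b))    # PMI(a, b) = log p(a,b) / (p(a) p(b))
        if pmi > min_pmi:
            pool[(a, b)] = pmi
    return pool

# Toy corpus: each set lists the entities mentioned in one passage.
toy_passages = [
    {"Mozart", "Salzburg"},
    {"Mozart", "Salzburg"},
    {"Mozart", "Vienna"},
    {"Paris", "Vienna"},
]
latent_pool = build_latent_pool(toy_passages)
```

In this toy corpus the pair (Mozart, Vienna) co-occurs less often than chance would predict, so its PMI is negative and it is filtered out, while (Mozart, Salzburg) and (Paris, Vienna) are kept as latent relation candidates.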
Query-Driven Dynamic Path Exploration
Central to Relink is query-driven dynamic path exploration, which uses a unified semantic space to reason jointly over explicit KG triples and latent relations. Candidate paths are expanded iteratively by a beam search whose query-aware ranker prioritizes facts that contribute directly to answering the query. The ranker scores each candidate by its utility for the query, which lets Relink actively discard distractors and use LLMs to instantiate missing relations from their contextual embeddings.
Figure 2: Relink's dynamic evidence graph construction. Relink iteratively builds reasoning paths by leveraging candidates from both the explicit KG (G_b) and latent co-occurrence relation pool (R_c) derived from the corpus. Encoders E_L and E_F project these candidates into a unified semantic space where a query-driven ranker evaluates their relevance.
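The exploration loop can be sketched as a small beam search in which KG edges and latent edges compete in the same vector space. This is a simplified illustration under stated assumptions, not the paper's implementation: the ranker here is plain cosine similarity to a query vector, path scores are summed edge scores, and the three-dimensional vectors, entity names, and the `beam_search` signature are all hypothetical stand-ins for learned encoders and a trained ranker.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def beam_search(query_vec, start, edges, beam_width=2, max_hops=2):
    """Expand paths hop by hop, keeping the beam_width best-scoring ones.

    edges: dict mapping head -> list of (relation, tail, vector, source),
    where source is "kg" or "latent" (assumed toy format).
    Returns a list of (score, end_node, path) sorted by descending score.
    """
    beams = [(0.0, start, [])]
    for _ in range(max_hops):
        expanded = []
        for score, node, path in beams:
            visited = {start} | {step[2] for step in path}
            for rel, tail, vec, source in edges.get(node, []):
                if tail in visited:           # skip cycles
                    continue
                step = (node, rel, tail, source)
                expanded.append((score + cosine(query_vec, vec), tail, path + [step]))
        if not expanded:                      # nothing left to expand
            break
        expanded.sort(key=lambda b: b[0], reverse=True)
        beams = expanded[:beam_width]
    return beams

# Toy vector dimensions: (authorship, birthplace, residence).
query_vec = [1.0, 1.0, 0.0]                   # "where was the composer of X born?"
toy_edges = {
    "song": [("composer of", "composer", [1.0, 0.0, 0.0], "latent")],
    "composer": [
        ("born in", "city_a", [0.0, 1.0, 0.0], "kg"),
        ("resides in", "city_b", [0.0, 0.3, 1.0], "kg"),   # distractor
    ],
}
best_score, best_end, best_path = beam_search(query_vec, "song", toy_edges)[0]
```

In this toy setup the top-ranked path follows the latent "composer of" edge and then the KG "born in" edge, while the "resides in" distractor scores lower. If the latent edge is removed, no path leaves the start node, which mirrors the broken-path failure of a static graph.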
Experimental Evaluation
Relink was evaluated on five open-domain question answering (ODQA) benchmarks, where it improved over existing GraphRAG baselines by an average of 5.4% in exact match (EM) and 5.2% in F1. The experiments indicate that Relink robustly addresses KG incompleteness and distractors by dynamically constructing query-driven reasoning paths.
(Table 1)
Table 1: Main performance comparison. Relink consistently outperforms all baseline methods across all datasets, demonstrating the effectiveness of its dynamic, query-driven path repair mechanism.
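For readers unfamiliar with the two metrics, the sketch below shows how EM and token-level F1 are commonly computed for ODQA answers. The summary does not state the exact normalization used in the paper, so the SQuAD-style normalization here (lowercasing, stripping punctuation and English articles) is an assumption, and the function names are illustrative.

```python
import re
import string
from collections import Counter

def normalize(text):
    """Lowercase, drop punctuation and articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())

def exact_match(prediction, gold):
    """1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize(prediction) == normalize(gold))

def f1_score(prediction, gold):
    """Harmonic mean of token-level precision and recall."""
    pred_tokens = normalize(prediction).split()
    gold_tokens = normalize(gold).split()
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)
```

EM rewards only answers that match after normalization, whereas F1 gives partial credit: a prediction of "city of Salzburg" against the gold answer "Salzburg" gets an EM of 0 but an F1 of 0.5.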
Robustness Analysis
Relink's robustness to knowledge sparsity was assessed by incrementally reducing the factual KG. Static GraphRAG methods showed significant performance declines as sparsity increased, while Relink maintained high performance, highlighting the resilience and adaptability that dynamic path construction provides.
Figure 3: Performance trend as the factual graph is reduced. Relink exhibits remarkable robustness to knowledge sparsity, whereas the baseline's performance collapses.
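One way to set up such a sparsity study is sketched below. The summary does not say how facts were removed, so uniform random removal is an assumption, as are the function name `sparsity_levels` and the toy triples. Shuffling once and taking prefixes keeps the levels nested, so each sparser graph is a subset of the denser one.

```python
import random

def sparsity_levels(triples, keep_ratios, seed=0):
    """Return {keep_ratio: subset of triples} with nested subsets.

    Random removal is an assumed protocol for illustration only.
    """
    shuffled = list(triples)
    random.Random(seed).shuffle(shuffled)     # fixed seed for reproducibility
    return {r: shuffled[: round(len(shuffled) * r)] for r in keep_ratios}

# Ten toy triples of the form (head, relation, tail).
toy_triples = [(f"e{i}", "related to", f"e{i + 1}") for i in range(10)]
levels = sparsity_levels(toy_triples, [1.0, 0.8, 0.5])
```

Each reduced graph would then be passed to the same retrieval and answering pipeline, and EM or F1 plotted against the keep ratio to obtain a trend like the one in Figure 3.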
Conclusion
The paper introduces a paradigm shift from static to dynamic evidence graph construction in GraphRAG, demonstrating superior performance and robustness to knowledge sparsity. This dynamic reason-and-construct approach holds promise for enhancing LLMs in multi-hop reasoning tasks, offering a more adaptable and precise method for ODQA challenges. Future work could explore further integration with multimodal data and expansion to broader reasoning domains, advancing the applicability and depth of AI reasoning systems.
Figure 4: A case study contrasting static reasoning with Relink's dynamic approach. The static baseline (w/o R_c) is misled by the highly relevant "resides in" distractor. In contrast, Relink succeeds by dynamically constructing the correct reasoning chain ("composer of" → "born in") and using its query-driven ranker to prioritize it.