- The paper introduces an agentic workflow that leverages dual-channel retrieval to integrate semantic and relational data for enhanced evidence collection.
- It employs a modular pipeline, with modules such as Query Decomposition, Context Refinement, and Query Expansion, to support iterative, multi-turn reasoning in LLMs.
- Experimental results show significant gains in retrieval precision and performance metrics such as SubEM, A-Score, and E-Score across multi-hop benchmarks.
GraphSearch: An Agentic Deep Searching Workflow for Graph Retrieval-Augmented Generation
GraphSearch is an approach to graph retrieval-augmented generation (GraphRAG) that targets two limitations of existing methods: shallow retrieval and inefficient use of graph data. It introduces an agentic deep searching workflow with dual-channel retrieval, which issues semantic queries over text chunks and relational queries over the graph. This overview covers the framework's design and its experimental results on factual reasoning with LLMs.
Background
GraphRAG frameworks enhance factual reasoning in LLMs through graph-based knowledge representations, but two challenges persist: shallow retrieval and suboptimal use of graph data. Existing GraphRAG methods often rely on a single round of retrieval, which surfaces too little evidence for complex queries. GraphSearch addresses these issues by pairing structured graph knowledge bases (KBs) with a modular workflow that supports multi-turn interaction and iterative reasoning.
GraphSearch Framework
Modular Deep Searching Pipeline
GraphSearch's architecture is modular, comprising six interconnected modules:
- Query Decomposition (QD): Decomposes complex queries into manageable sub-queries, enabling fine-grained evidence retrieval.
- Context Refinement (CR): Filters redundant information to highlight pertinent entities and relationships.
- Query Grounding (QG): Enriches later sub-queries with intermediate answers from earlier retrievals.
- Logic Drafting (LD): Constructs a coherent reasoning chain from available evidence.
- Evidence Verification (EV): Evaluates logical consistency and sufficiency of the reasoning chain.
- Query Expansion (QE): Generates additional sub-queries to address any identified knowledge gaps.
The framework's agentic capabilities allow for iterative retrieval and reflection, enhancing the reasoning quality over multiple interaction rounds.
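The interaction between the six modules can be sketched as a control loop. The following is a minimal illustration, not the authors' implementation: the `Modules` container, the `deep_search` function, the order in which modules are called, and the toy lambdas standing in for LLM calls are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class Modules:
    """Hypothetical callables standing in for the six LLM-backed modules."""
    decompose: Callable[[str], List[str]]          # QD: query -> sub-queries
    refine: Callable[[str, List[str]], List[str]]  # CR: filter retrieved hits
    ground: Callable[[str, List[str]], str]        # QG: enrich sub-query with prior evidence
    draft: Callable[[str, List[str]], str]         # LD: build a reasoning chain
    verify: Callable[[str, List[str]], bool]       # EV: is the chain sufficient?
    expand: Callable[[str, List[str]], List[str]]  # QE: new sub-queries for gaps


def deep_search(query: str,
                retrieve: Callable[[str], List[str]],
                m: Modules,
                max_rounds: int = 3) -> Tuple[str, List[str]]:
    """Iterate retrieval and reflection until the draft is verified (assumed ordering)."""
    evidence: List[str] = []
    pending = m.decompose(query)
    draft = ""
    for _ in range(max_rounds):
        for sub_query in pending:
            grounded = m.ground(sub_query, evidence)
            kept = m.refine(grounded, retrieve(grounded))
            evidence.extend(e for e in kept if e not in evidence)
        draft = m.draft(query, evidence)
        if m.verify(draft, evidence):
            break
        pending = m.expand(draft, evidence)
        if not pending:  # nothing left to ask
            break
    return draft, evidence


# Toy knowledge base and rule-based stand-ins for the LLM modules (illustrative only).
KB: Dict[str, List[str]] = {
    "who directed inception": ["Inception was directed by Christopher Nolan."],
    "where was christopher nolan born": ["Christopher Nolan was born in London."],
}

toy = Modules(
    decompose=lambda q: ["who directed inception"],
    refine=lambda q, hits: hits[:1],
    ground=lambda sub, ev: sub.replace("<director>", "christopher nolan") if ev else sub,
    draft=lambda q, ev: " ".join(ev),
    verify=lambda d, ev: "London" in d,
    expand=lambda d, ev: ["where was <director> born"],
)

draft, evidence = deep_search(
    "Where was the director of Inception born?",
    retrieve=lambda q: KB.get(q.lower(), []),
    m=toy,
)
```

In this toy run the first round retrieves only the director, verification fails, and Query Expansion adds a second sub-query that Query Grounding fills in before the next retrieval.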
Figure 1: Overview of our GraphSearch framework.
Dual-Channel Retrieval Strategy
GraphSearch employs a dual-channel retrieval methodology:
- Semantic Channel: Retrieves descriptive evidence from text chunks using semantically coherent sub-queries.
- Relational Channel: Utilizes subject-predicate-object relations to retrieve structured graph data, guiding multi-hop reasoning by employing subgraph entities and relations.
The two channels complement each other, and each modality is used in the role it suits: text chunks supply descriptive evidence, while graph relations guide multi-hop reasoning.
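A minimal sketch of the two channels follows, under simplifying assumptions. Token overlap stands in for whatever semantic similarity the retriever actually uses, the relational query is written by hand as a subject-predicate-object pattern with `None` as a wildcard, and all function names and data are hypothetical.

```python
import re
from typing import Dict, List, Optional, Set, Tuple

Triple = Tuple[str, str, str]
Pattern = Tuple[Optional[str], Optional[str], Optional[str]]

CHUNKS: List[str] = [
    "Inception is a 2010 science fiction film directed by Christopher Nolan.",
    "Christopher Nolan was born in London, England.",
    "The Eiffel Tower is located in Paris.",
]
TRIPLES: List[Triple] = [
    ("Inception", "directed_by", "Christopher Nolan"),
    ("Christopher Nolan", "born_in", "London"),
    ("Eiffel Tower", "located_in", "Paris"),
]


def _tokens(text: str) -> Set[str]:
    """Lowercased alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def semantic_channel(sub_query: str, chunks: List[str], k: int = 2) -> List[str]:
    """Rank chunks by token overlap (a stand-in for embedding similarity)."""
    q = _tokens(sub_query)
    ranked = sorted(chunks, key=lambda c: len(q & _tokens(c)), reverse=True)
    return [c for c in ranked[:k] if q & _tokens(c)]


def relational_channel(pattern: Pattern, triples: List[Triple]) -> List[Triple]:
    """Return triples matching the pattern; None matches anything."""
    return [t for t in triples
            if all(p is None or p == v for p, v in zip(pattern, t))]


def dual_channel_retrieve(sub_query: str, pattern: Pattern,
                          chunks: List[str], triples: List[Triple],
                          k: int = 2) -> Dict[str, list]:
    """Collect evidence from both channels, kept separate by modality."""
    return {
        "semantic": semantic_channel(sub_query, chunks, k),
        "relational": relational_channel(pattern, triples),
    }


result = dual_channel_retrieve(
    "Who directed Inception?",
    ("Inception", "directed_by", None),
    CHUNKS, TRIPLES, k=1,
)
```

Keeping the two result lists separate, rather than merging them into one ranking, lets a downstream step treat text as descriptive evidence and triples as hop-by-hop guidance.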
Experimental Evaluation
Experiments on six multi-hop RAG benchmarks, including HotpotQA, MuSiQue, and 2WikiMultiHopQA, show that GraphSearch consistently improves SubEM, A-Score, and E-Score over traditional approaches.
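Of these metrics, SubEM (substring exact match) is commonly computed by checking whether a gold answer appears inside the model's prediction after normalization. The sketch below assumes SQuAD-style normalization (lowercasing, stripping punctuation and articles); the exact normalization used in the paper is not stated here, and the function names are illustrative.

```python
import re
import string
from typing import List


def normalize(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def sub_em(prediction: str, gold_answers: List[str]) -> float:
    """Return 1.0 if any normalized gold answer is a substring of the prediction."""
    pred = normalize(prediction)
    golds = [normalize(g) for g in gold_answers]
    return float(any(g and g in pred for g in golds))


hit = sub_em("The director was born in London, England.", ["London"])
miss = sub_em("He was born in Paris.", ["London"])
```

Averaging `sub_em` over a benchmark's questions gives the dataset-level score.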
Key Findings
- Performance Enhancement: GraphSearch outperforms baseline GraphRAG approaches by enabling more comprehensive evidence retrieval through iterative reasoning and multi-turn interactions.
- Plug-and-Play Capability: The framework integrates seamlessly with existing GraphRAG methods, enhancing retrieval effectiveness regardless of the underlying graph KB configuration.
- Efficiency Under Constraints: GraphSearch maintains strong retrieval fidelity even under reduced retrieval budgets, highlighting its potential for low-resource environments.
Figure 2: Comparisons between dual-channel and single-channel retrieval in GraphSearch.
Conclusion
GraphSearch advances graph retrieval-augmented generation by addressing shallow retrieval and the underuse of the text and graph modalities. Its agentic architecture tightly couples evidence collection with logic refinement, which helps LLMs answer complex reasoning tasks with greater factual accuracy. Future work could explore tuning GraphSearch with advanced learning strategies and applying it to multimodal retrieval.