GraphSearch: An Agentic Deep Searching Workflow for Graph Retrieval-Augmented Generation (2509.22009v2)

Published 26 Sep 2025 in cs.CL

Abstract: Graph Retrieval-Augmented Generation (GraphRAG) enhances factual reasoning in LLMs by structurally modeling knowledge through graph-based representations. However, existing GraphRAG approaches face two core limitations: shallow retrieval that fails to surface all critical evidence, and inefficient utilization of pre-constructed structural graph data, which hinders effective reasoning from complex queries. To address these challenges, we propose GraphSearch, a novel agentic deep searching workflow with dual-channel retrieval for GraphRAG. GraphSearch organizes the retrieval process into a modular framework comprising six modules, enabling multi-turn interactions and iterative reasoning. Furthermore, GraphSearch adopts a dual-channel retrieval strategy that issues semantic queries over chunk-based text data and relational queries over structural graph data, enabling comprehensive utilization of both modalities and their complementary strengths. Experimental results across six multi-hop RAG benchmarks demonstrate that GraphSearch consistently improves answer accuracy and generation quality over the traditional strategy, confirming GraphSearch as a promising direction for advancing graph retrieval-augmented generation.

Summary

The paper introduces an agentic workflow that leverages dual-channel retrieval to integrate semantic and relational data for enhanced evidence collection.
It employs a modular pipeline—including Query Decomposition, Context Refinement, and Query Expansion—to support iterative, multi-turn reasoning in LLMs.
Experimental results show consistent gains in answer accuracy and generation quality, measured by metrics such as SubEM, A-Score, and E-Score, across six multi-hop benchmarks.

GraphSearch: An Agentic Deep Searching Workflow for Graph Retrieval-Augmented Generation

GraphSearch is a new approach to graph retrieval-augmented generation (GraphRAG) that targets two limitations of existing methods: shallow retrieval and inefficient use of graph data. It introduces an agentic deep searching workflow with dual-channel retrieval, issuing semantic queries over text chunks and relational queries over graph structure. This overview covers the framework's design and its experimental results, along with its potential for improving factual reasoning in LLMs.

Background

GraphRAG frameworks enhance factual reasoning in LLMs through graph-based knowledge representations, yet two problems persist: shallow retrieval and suboptimal use of graph data. Existing GraphRAG methods often rely on a single round of retrieval, which fails to surface all the evidence that complex queries require. GraphSearch addresses this by pairing a structured graph knowledge base (KB) with a modular workflow that supports multi-turn interaction and iterative reasoning.

GraphSearch Framework

Modular Deep Searching Pipeline

GraphSearch's architecture is distinctly modular, composed of six interconnected modules:

Query Decomposition (QD): Decomposes complex queries into manageable sub-queries, enabling fine-grained evidence retrieval.
Context Refinement (CR): Filters redundant information to highlight pertinent entities and relationships.
Query Grounding (QG): Enriches each sub-query with intermediate answers from previous retrievals so that it is grounded in context.
Logic Drafting (LD): Constructs a coherent reasoning chain from available evidence.
Evidence Verification (EV): Evaluates logical consistency and sufficiency of the reasoning chain.
Query Expansion (QE): Generates additional sub-queries to address any identified knowledge gaps.

The framework's agentic capabilities allow for iterative retrieval and reflection, enhancing the reasoning quality over multiple interaction rounds.
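To make the iterative loop concrete, here is a minimal sketch of how the six modules might be wired together. The control flow (decompose once, then ground, retrieve, refine, draft, verify, and expand per round), the `Modules` container, the `graph_search` function, and all toy stubs are assumptions made for illustration; the summary above names the modules but does not specify their exact ordering or prompts. In a real system each callable would wrap an LLM call.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Modules:
    """Hypothetical container for the six modules (each would wrap an LLM prompt)."""
    decompose: Callable[[str], List[str]]          # QD
    ground: Callable[[str, List[str]], str]        # QG
    refine: Callable[[str, List[str]], List[str]]  # CR
    draft: Callable[[str, List[str]], str]         # LD
    verify: Callable[[str, str], bool]             # EV
    expand: Callable[[str, str], List[str]]        # QE


def graph_search(question, retrieve, m, max_rounds=3):
    """Assumed loop: retrieve per sub-query, draft, verify, expand if needed."""
    pending = m.decompose(question)
    evidence, answers, draft = [], [], ""
    for _ in range(max_rounds):
        for sub_query in pending:
            grounded = m.ground(sub_query, answers)        # fill in earlier answers
            kept = m.refine(grounded, retrieve(grounded))  # drop redundant hits
            evidence.extend(kept)
            if kept:
                answers.append(kept[0])
        draft = m.draft(question, evidence)
        if m.verify(question, draft):                      # evidence judged sufficient
            break
        pending = m.expand(question, draft)                # new sub-queries for gaps
        if not pending:
            break
    return draft, evidence


# Toy knowledge base and rule-based stubs, purely illustrative.
TOY_KB = {
    "who directed Inception": "Christopher Nolan",
    "where was Christopher Nolan born": "London",
}


def toy_retrieve(query):
    return [TOY_KB[query]] if query in TOY_KB else []


def toy_ground(sub_query, answers):
    # Replace placeholders "#1", "#2", ... with the matching earlier answers.
    for i, answer in enumerate(answers, start=1):
        sub_query = sub_query.replace(f"#{i}", answer)
    return sub_query


TOY_MODULES = Modules(
    decompose=lambda q: ["who directed Inception", "where was #1 born"],
    ground=toy_ground,
    refine=lambda q, hits: list(dict.fromkeys(hits)),  # order-preserving dedupe
    draft=lambda q, ev: " -> ".join(ev),
    verify=lambda q, d: "->" in d,                     # toy check: at least two facts
    expand=lambda q, d: [],
)

draft, evidence = graph_search(
    "where was the director of Inception born", toy_retrieve, TOY_MODULES
)
print(draft)
```

The second sub-query can only be answered after the first one resolves, which is the kind of dependency that a single retrieval round would miss.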

Figure 1: Overview of our GraphSearch framework.

Dual-Channel Retrieval Strategy

GraphSearch employs a dual-channel retrieval methodology:

Semantic Channel: Retrieves descriptive evidence from text chunks using semantically coherent sub-queries.
Relational Channel: Uses subject-predicate-object queries to retrieve structured graph data, so that subgraph entities and relations guide multi-hop reasoning.

The two channels match each modality to its functional role: text chunks supply descriptive evidence, while graph structure supplies the relational links needed for multi-hop reasoning. Together they provide a robust backbone for resolving complex queries.
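The two channels can be illustrated with a small sketch. Everything here is an assumption for demonstration: the toy chunks and triples, the token-overlap scoring that stands in for an embedding-based retriever, and the function names `semantic_retrieve`, `relational_retrieve`, and `dual_channel`. None of it is taken from the GraphSearch implementation.

```python
import re
from typing import Dict, List, Optional, Tuple

Triple = Tuple[str, str, str]
Pattern = Tuple[Optional[str], Optional[str], Optional[str]]

# Toy data, purely illustrative.
CHUNKS = [
    "Christopher Nolan is a British-American filmmaker born in London.",
    "Inception is a 2010 science fiction film.",
]
TRIPLES: List[Triple] = [
    ("Inception", "directed_by", "Christopher Nolan"),
    ("Christopher Nolan", "born_in", "London"),
]


def _tokens(text: str) -> set:
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def semantic_retrieve(query: str, chunks: List[str], top_k: int = 1) -> List[str]:
    """Semantic channel stand-in: rank chunks by token overlap with the query."""
    q = _tokens(query)
    ranked = sorted(chunks, key=lambda c: len(q & _tokens(c)), reverse=True)
    return [c for c in ranked[:top_k] if q & _tokens(c)]


def relational_retrieve(pattern: Pattern, triples: List[Triple]) -> List[Triple]:
    """Relational channel stand-in: match a (subject, predicate, object)
    pattern against the graph, where None acts as a wildcard."""
    return [
        t for t in triples
        if all(want is None or want == got for want, got in zip(pattern, t))
    ]


def dual_channel(semantic_query: str, relational_query: Pattern) -> Dict[str, list]:
    """Issue both queries and return the evidence from each channel."""
    return {
        "semantic": semantic_retrieve(semantic_query, CHUNKS),
        "relational": relational_retrieve(relational_query, TRIPLES),
    }


result = dual_channel(
    "Where was Christopher Nolan born?",
    ("Christopher Nolan", "born_in", None),
)
print(result)
```

Keeping the channels separate lets each query be phrased in the form its index understands: free text for chunks and a structured pattern for the graph.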

Experimental Evaluation

Experiments on six multi-hop RAG benchmarks, including HotpotQA, MuSiQue, and 2WikiMultiHopQA, show that GraphSearch consistently improves SubEM, A-Score, and E-Score over traditional approaches.
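The summary does not define these metrics. As a rough guide, SubEM is commonly computed as substring exact match: a prediction scores 1 if any normalized gold answer appears inside the normalized prediction. The sketch below assumes that definition and a SQuAD-style normalization (lowercasing, stripping punctuation and articles); the paper's exact normalization may differ. A-Score and E-Score are not sketched because their definitions are not given here.

```python
import re
import string
from typing import List


def normalize(text: str) -> str:
    """Lowercase, strip punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def sub_em(prediction: str, gold_answers: List[str]) -> float:
    """Return 1.0 if any normalized gold answer is a substring of the prediction."""
    pred = normalize(prediction)
    golds = [normalize(g) for g in gold_answers]
    return float(any(g and g in pred for g in golds))


def mean_sub_em(predictions: List[str], golds: List[List[str]]) -> float:
    """Average SubEM over a dataset of (prediction, gold answer list) pairs."""
    scores = [sub_em(p, g) for p, g in zip(predictions, golds)]
    return sum(scores) / len(scores) if scores else 0.0
```

For example, `sub_em("The film was directed by Christopher Nolan.", ["Christopher Nolan"])` matches because the gold answer survives normalization as a substring of the prediction, which makes the metric more forgiving of verbose answers than strict exact match.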

Key Findings

Performance Enhancement: GraphSearch outperforms baseline GraphRAG approaches by enabling more comprehensive evidence retrieval through iterative reasoning and multi-turn interactions.
Plug-and-Play Capability: The framework integrates seamlessly with existing GraphRAG methods, enhancing retrieval effectiveness regardless of the underlying graph KB configuration.
Efficiency Under Constraints: GraphSearch maintains strong retrieval fidelity even under reduced retrieval budgets, highlighting its potential for low-resource environments.

Figure 2: Comparisons between dual-channel and single-channel retrieval in GraphSearch.

Conclusion

GraphSearch advances graph retrieval-augmented generation by addressing two limitations: shallow retrieval and underuse of the graph modality. Its agentic architecture tightly couples evidence collection with logic refinement, helping LLMs reach higher factual accuracy on complex reasoning tasks. Future work could explore tuning GraphSearch with advanced learning strategies and applying it to multimodal retrieval.