LogicRAG Framework

Updated 15 November 2025
  • LogicRAG is a dynamic framework that decomposes complex queries into minimal subproblems and constructs a directed acyclic graph for adaptive reasoning.
  • It employs on-the-fly query decomposition and topological sorting to efficiently orchestrate multi-step retrieval and answer generation.
  • Graph and context pruning strategies in LogicRAG reduce token usage by up to 70% with negligible loss in accuracy, substantially improving resource efficiency in multi-hop tasks.

The LogicRAG framework is a logic-aware retrieval-augmented generation system for LLMs that dispenses with pre-built graphs. Instead, it dynamically decomposes complex queries into subproblems, determines their dependencies, and orchestrates adaptive, efficient knowledge retrieval and multi-step reasoning entirely at inference time. LogicRAG is designed to improve both accuracy and resource efficiency in complex question answering settings, especially those requiring multi-hop reasoning, by modeling the query’s latent logic structure with a dynamically built directed acyclic graph (DAG) rather than a static corpus-wide graph.

1. Motivation and Problem Definition

LLMs are susceptible to hallucination, generating factually incorrect responses when confronted with queries outside their training distribution or knowledge scope. Retrieval-augmented generation (RAG) mitigates this by grounding LLMs with relevant passages from an external corpus: $C = R(Q)$, $A = f_{\mathrm{LLM}}(Q \mid C)$, where $Q$ is a query, $K$ is the corpus, $R(\cdot)$ a retrieval function over $K$ returning context $C$, and $f_{\mathrm{LLM}}$ the LLM.
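
This retrieve-then-generate pattern can be sketched in a few lines of Python. The `retrieve` and `generate` callables below are hypothetical stand-ins for a retriever over $K$ and an LLM backend; they are illustrative assumptions, not part of LogicRAG or any specific library.

# Minimal sketch of vanilla RAG: C = R(Q), then A = f_LLM(Q | C).
# `retrieve` and `generate` are hypothetical callables supplied by the caller.
def vanilla_rag(query, retrieve, generate, k=5):
    context = retrieve(query, top_k=k)            # C = R(Q): top-k passages from the corpus
    prompt = "Context:\n" + "\n".join(context) + "\n\nQuestion: " + query + "\nAnswer:"
    return generate(prompt)                       # A = f_LLM(Q | C)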

Graph-based RAG (GraphRAG) methods have leveraged offline-constructed knowledge graphs for retrieval, showing improvements on complex multi-hop questions. However, such approaches incur heavy preprocessing costs—requiring transformation of the entire corpus into a graph, consuming thousands of tokens and many minutes even for moderate corpora. Furthermore, static graphs are query-agnostic; their edges and structure may not fit the logical requirements of individual queries, leading to misaligned retrieval, inefficiency, and update latency.

LogicRAG addresses these limitations by dynamically discovering a problem-specific reasoning structure at inference time, decomposing the query, extracting a dependency DAG, and controlling retrieval and generation adaptively without the need for a pre-built graph.

2. Dynamic Query Decomposition and DAG Construction

Central to LogicRAG is on-the-fly query decomposition. An LLM-based decomposition function

$$f_{\mathrm{decomp}} : Q \mapsto P = \{p_1, \ldots, p_n\}$$

segments $Q$ into minimal, non-overlapping subproblems that together cover all required knowledge. Decomposition is operationalized with few-shot prompting, e.g., “Segment the question into minimal reasoning steps.” Completeness and non-overlap are enforced.
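
A minimal sketch of this decomposition step is shown below. The prompt wording and the `llm` callable are illustrative assumptions, not the paper's exact prompt or interface.

# Sketch of f_decomp: prompt an LLM to split Q into minimal, non-overlapping subproblems.
# `llm` is a hypothetical text-in/text-out callable; the prompt is illustrative only.
DECOMP_PROMPT = (
    "Segment the question into minimal reasoning steps.\n"
    "Return one step per line, with no overlap between steps.\n\n"
    "Question: {question}\nSteps:"
)

def decompose(question, llm):
    raw = llm(DECOMP_PROMPT.format(question=question))
    # Each non-empty line becomes one subproblem p_i.
    return [line.strip() for line in raw.splitlines() if line.strip()]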

LogicRAG then induces a directed acyclic graph

$$G = (V, E), \quad V = \{v_i\}, \; v_i \leftrightarrow p_i$$

where an edge $(v_i \rightarrow v_j) \in E$ is present if $p_j$ depends on the answer to $p_i$. Edges are inferred by prompting the LLM on stepwise dependencies, and a DFS check ensures acyclicity. This process encodes the latent reasoning pathway best suited to each query instance.
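
One plausible realization of this step is sketched below: pairwise dependency queries to the LLM followed by a DFS-based acyclicity check. The yes/no prompt and the `llm` callable are assumptions made for illustration.

# Sketch of DAG construction over subproblems P = {p_1, ..., p_n}.
# `llm` is a hypothetical callable expected to answer "yes" or "no".
def build_dag(subproblems, llm):
    n = len(subproblems)
    graph = {i: [] for i in range(n)}   # edge i -> j means p_j depends on p_i
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            q = (f"Does answering '{subproblems[j]}' require the answer to "
                 f"'{subproblems[i]}'? Answer yes or no.")
            if llm(q).strip().lower().startswith("yes"):
                graph[i].append(j)
    assert is_acyclic(graph), "dependency structure must form a DAG"
    return graph

def is_acyclic(graph):
    # Three-color DFS: an edge back to a node on the current path signals a cycle.
    WHITE, GRAY, BLACK = 0, 1, 2
    color = {v: WHITE for v in graph}
    def dfs(v):
        color[v] = GRAY
        for u in graph[v]:
            if color[u] == GRAY:
                return False
            if color[u] == WHITE and not dfs(u):
                return False
        color[v] = BLACK
        return True
    return all(dfs(v) for v in graph if color[v] == WHITE)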

3. Retrieval and Reasoning Scheduling via Graph Linearization

To guide execution, LogicRAG linearizes the DAG using a topological sort: $\sigma = [v_{(1)}, v_{(2)}, \ldots, v_{(n)}]$ with $\mathrm{rank}(v_{(i)}) = i$, ensuring dependencies are resolved in order. This sequence allows logic-respecting scheduling of retrieval and subproblem answering. The topological sort is implemented via DFS in $\mathcal{O}(|V| + |E|)$ time, as shown below:

# G is an adjacency mapping: G[v] is the list of successors (dependents) of v.
def TopoSort(G):
    visited = set()
    stack = []
    def DFS(v):
        visited.add(v)
        for u in G[v]:                # visit all dependents of v first
            if u not in visited:
                DFS(u)
        stack.append(v)               # v is appended only after its descendants
    for v in G:
        if v not in visited:
            DFS(v)
    return list(reversed(stack))      # reverse postorder = topological order

At each stage, a merged or individual subproblem is resolved, and its context retrieved in alignment with the global reasoning chain.

4. Pruning Mechanisms for Efficiency

LogicRAG implements two pruning strategies to optimize both accuracy and resource efficiency:

  • Graph Pruning: At each topological level $r$, the sibling set $S^{(r)}$ of subproblems is scored for semantic similarity. If $\mathrm{sim}(p_i, p_j) > \tau$, the subproblems are merged into a unified query $q^{\mathrm{uni}}$ via the LLM, reducing redundant retrieval.
  • Context Pruning: As retrieval progresses, a rolling memory $\mathrm{Mem}^{(r)}$ accumulates the most salient retrieved information. It is updated by summarization, $\mathrm{Mem}^{(r)} = \mathrm{Summarize}(\mathrm{Mem}^{(r-1)} \cup C^{(r)})$, and any retrieved passage $d$ with score $R(q, d) < \delta$ is dropped. This filtering limits token bloat and maintains a high signal-to-noise ratio for downstream reasoning. Both strategies are sketched below.
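
A compact sketch of both strategies follows, assuming a hypothetical `embed` encoder, LLM-backed `llm_merge` and `summarize` callables, and illustrative values for the thresholds $\tau$ and $\delta$ (the paper's actual settings are not reproduced here).

import math

# Sketch of LogicRAG's two pruning strategies; all callables and thresholds are illustrative.
def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / max(norm, 1e-12)

def graph_prune(siblings, embed, llm_merge, tau=0.85):
    """Graph pruning: merge same-level subproblems whose similarity exceeds tau."""
    vecs = [embed(p) for p in siblings]
    merged, used = [], set()
    for i, p in enumerate(siblings):
        if i in used:
            continue
        group = [p]
        for j in range(i + 1, len(siblings)):
            if j not in used and cosine(vecs[i], vecs[j]) > tau:
                group.append(siblings[j])
                used.add(j)
        # Near-duplicate subproblems collapse into one unified query q^uni via the LLM.
        merged.append(llm_merge(group) if len(group) > 1 else p)
    return merged

def context_prune(memory, passages, scores, summarize, delta=0.3):
    """Context pruning: drop low-scoring passages, then fold the rest into the rolling memory."""
    kept = [d for d, s in zip(passages, scores) if s >= delta]   # drop d with R(q, d) < delta
    return summarize(memory, kept)   # Mem^(r) = Summarize(Mem^(r-1) U C^(r))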

5. Adaptive Retrieval and Generation Pipeline

For each (possibly merged) subproblem $q$, LogicRAG retrieves the top-$k$ passages $C = \{d_j\}_{j=1}^{k}$ ranked by $R(q, d_j) = \langle \mathrm{embed}(q), \mathrm{embed}(d_j) \rangle$, i.e., the cosine similarity between query and document embeddings.
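
A sketch of this retrieval step is given below, assuming a hypothetical `embed` encoder that returns unit-normalized vectors so that the inner product coincides with cosine similarity.

# Sketch of top-k retrieval for a (possibly merged) subproblem q.
# `embed` is a hypothetical encoder assumed to return unit-normalized vectors.
def retrieve_top_k(query, corpus, embed, k=5):
    q_vec = embed(query)
    scored = []
    for doc in corpus:
        d_vec = embed(doc)
        score = sum(a * b for a, b in zip(q_vec, d_vec))   # R(q, d) = <embed(q), embed(d)>
        scored.append((score, doc))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:k]]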

The reasoning pipeline operates as a forward pass across the sorted DAG levels. For each rank $r$:

  • Construct $q^{(r)}$ by merging the subproblems in $S^{(r)}$.
  • Retrieve $C^{(r)}$.
  • Summarize to update $\mathrm{Mem}^{(r)}$.
  • Prompt the LLM for each $p_i \in S^{(r)}$ with the rolling memory:
    "Given rolling memory Mem^(r), answer subproblem p_i:
      [p_i]
      Context: [Mem^(r)]"
    If novel subproblems are articulated by the LLM, they are dynamically added to $G$ and processed recursively, ensuring completeness. A sketch of this forward pass is given below.
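
The control flow of the forward pass can be outlined as follows. All capabilities are injected as hypothetical callables so that only the scheduling logic is visible: `prune_siblings` merges similar subproblems (graph pruning), `retrieve` fetches context for a query, `update_memory` performs context pruning and summarization, and `answer` prompts the LLM for one subproblem given the rolling memory. This is an illustrative outline, not the paper's implementation.

# Sketch of the per-level forward pass over the linearized DAG.
# `levels` lists groups of subproblem indices in topological order.
def logicrag_forward(levels, subproblems, prune_siblings, retrieve,
                     update_memory, answer):
    memory = ""                                   # Mem^(0): empty rolling memory
    answers = {}
    for level in levels:                          # dependencies resolved before dependents
        siblings = [subproblems[i] for i in level]
        queries = prune_siblings(siblings)        # graph pruning: merge near-duplicate siblings
        context = [doc for q in queries for doc in retrieve(q)]
        memory = update_memory(memory, context)   # context pruning + rolling summarization
        for i in level:
            answers[i] = answer(subproblems[i], memory)
    return answers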

6. Experimental Results, Efficiency, and Illustrative Example

LogicRAG was evaluated on HotpotQA (2-hop), 2WikiMQA (2–4 hops), and MuSiQue (multi-hop questions composed from single-hop questions) against baselines including vanilla RAG (various $k$), zero-shot LLMs, and state-of-the-art GraphRAG-style models (KGP, RAPTOR, GraphRAG, LightRAG, HippoRAG, HippoRAG2). Key findings:

  • On 2WikiMQA, string-match accuracy increased from 50.0% to 64.7% (+14.7 pp over the best baseline).
  • Average token consumption on 2WikiMQA was reduced to ~1.8K, versus 2.8–4.7K for GraphRAG variants.
  • Latency per question decreased to ~9.8 seconds, compared to 13–35 seconds for graph-based methods.
  • Combined pruning mechanisms (graph and context) reduced token usage per query by 60–70% without significant impact on accuracy for retrieval with $k > 5$ (further increases in $k$ offer minimal gains but linearly growing resource usage).

An illustrative example with three reasoning steps is as follows. For the question, “What month did Tripartite discussions begin between Britain, France, and the country…?”:

  1. Decomposition: $p_1$ (identify the country), $p_2$ (decode “nobilities commonwealth”), $p_3$ (find the month).
  2. DAG construction: $v_2 \rightarrow v_1 \rightarrow v_3$, reflecting the dependency chain.
  3. Topological sort: $\sigma = [v_2, v_1, v_3]$.
  4. Iterative reasoning:
    • $r=1$, $q^{(1)}$: “What historic entity does ‘nobilities commonwealth’ refer to?” (Answer: "Polish–Lithuanian Commonwealth")
    • $r=2$, $q^{(2)}$: “Given $\mathrm{Mem}^{(1)}$, what country did Warsaw Pact leadership originate from?” (Answer: "the Soviet Union")
    • $r=3$, $q^{(3)}$: “When (month) did Tripartite discussions (Britain, France, Soviet Union) begin?” (Answer: "June")
    • Context pruning distills ~600 tokens of retrieved content per round to ~150 core tokens; graph pruning was not needed in this instance.

7. Significance and Context within Retrieval-Augmented Reasoning

LogicRAG demonstrates that adaptive, inference-stage logic modeling outperforms prior static GraphRAG approaches both in answer quality and efficiency. By eschewing any global, pre-computed graph, it minimizes preprocessing overhead, reduces per-query resource consumption, and enables dynamic alignment between retrieval structure and query logic. The methodology also underscores the greater generality and extensibility of logic-structured retrieval—facilitating multi-step, dependency-aware reasoning for arbitrary queries encountered at inference time. Extensive benchmarks show that dynamic logic-aware retrieval both outperforms prior pre-built graph baselines and achieves 30–60% savings in token and latency costs for complex multi-hop question answering tasks (Chen et al., 8 Aug 2025).
