EraRAG: Efficient and Incremental Retrieval Augmented Generation for Growing Corpora (2506.20963v2)

Published 26 Jun 2025 in cs.IR and cs.LG

Abstract: Graph-based Retrieval-Augmented Generation (Graph-RAG) enhances LLMs by structuring retrieval over an external corpus. However, existing approaches typically assume a static corpus, requiring expensive full-graph reconstruction whenever new documents arrive, limiting their scalability in dynamic, evolving environments. To address these limitations, we introduce EraRAG, a novel multi-layered Graph-RAG framework that supports efficient and scalable dynamic updates. Our method leverages hyperplane-based Locality-Sensitive Hashing (LSH) to partition and organize the original corpus into hierarchical graph structures, enabling efficient and localized insertions of new data without disrupting the existing topology. The design eliminates the need for retraining or costly recomputation while preserving high retrieval accuracy and low latency. Experiments on large-scale benchmarks demonstrate that EraRAG achieves up to an order of magnitude reduction in update time and token consumption compared to existing Graph-RAG systems, while providing superior accuracy performance. This work offers a practical path forward for RAG systems that must operate over continually growing corpora, bridging the gap between retrieval efficiency and adaptability. Our code and data are available at https://github.com/EverM0re/EraRAG-Official.

Summary

  • The paper introduces EraRAG, which efficiently integrates dynamic updates in retrieval-augmented generation systems through hyperplane-based LSH.
  • It constructs a multi-layered retrieval graph to minimize token usage and rebuild time, achieving reductions of up to 57.6% and 77.5%, respectively, over traditional methods.
  • The method supports high-accuracy question answering by balancing detailed semantic retrieval with efficient localized re-partitioning.

EraRAG: Efficient and Incremental Retrieval Augmented Generation for Growing Corpora

The paper introduces EraRAG, an innovative framework for Retrieval-Augmented Generation (RAG) designed to efficiently handle dynamic updates in continuously evolving corpora. This approach addresses the challenges that current Graph-RAG systems face, especially their inefficiency in dealing with incremental data updates. EraRAG offers a scalable solution by structuring retrieval over an external corpus with a multi-layered graph architecture capable of fast, localized updates.

Motivation and Background

Existing RAG methods enhance LLMs by integrating external knowledge, thereby improving their ability to handle domain-specific queries, multi-hop reasoning, and deep contextual understanding. However, these systems often operate with static corpora, which necessitate expensive full-graph reconstruction when new documents are introduced. This limitation significantly restricts scalability in real-world applications where data grows continuously.

EraRAG addresses this issue by leveraging hyperplane-based Locality-Sensitive Hashing (LSH) to organize and manage data updates. This design eliminates the need for costly retraining or global graph recomputation while maintaining high retrieval accuracy and low latency, making it a practical approach for dynamic environments.

Implementation Details

Graph Construction

EraRAG constructs a hierarchical retrieval graph by organizing the input corpus into a multi-layered structure. The process involves partitioning corpus chunks into groups using LSH-based segmentation, designed to ensure semantic coherence while allowing for efficient updates. Each chunk is encoded into an embedding vector and hashed using a set of hyperplanes to determine its grouping, facilitating efficient semantic clustering.

Figure 1: Overview of EraRAG. The framework constructs a hierarchical retrieval graph, with efficient update mechanisms via selective re-partitioning and summarization.
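
To make the hashing step concrete, the sketch below assigns chunk embeddings to buckets according to the sign pattern of their projections onto a set of random hyperplanes, which is the standard hyperplane-LSH construction. The function names, the NumPy implementation, and the string bucket keys are illustrative assumptions, not code from the EraRAG repository.

```python
import numpy as np

def build_hyperplanes(dim: int, n_planes: int, seed: int = 0) -> np.ndarray:
    """Sample random hyperplane normal vectors for sign-based LSH."""
    rng = np.random.default_rng(seed)
    return rng.standard_normal((n_planes, dim))

def lsh_bucket(embedding: np.ndarray, hyperplanes: np.ndarray) -> str:
    """Hash an embedding to a bucket key given by the sign of each projection."""
    signs = (hyperplanes @ embedding) >= 0
    return "".join("1" if s else "0" for s in signs)

def partition_chunks(embeddings: np.ndarray, hyperplanes: np.ndarray) -> dict:
    """Group chunk indices by LSH bucket; nearby embeddings tend to share
    sign patterns, so each bucket forms a semantically coherent segment."""
    buckets: dict[str, list[int]] = {}
    for idx, emb in enumerate(embeddings):
        buckets.setdefault(lsh_bucket(emb, hyperplanes), []).append(idx)
    return buckets

# Example usage (hypothetical encoder and sizes):
# embeddings = encode(chunks)                      # any sentence encoder, 768-d here
# planes = build_hyperplanes(dim=768, n_planes=6)  # 2^6 = 64 possible buckets
# buckets = partition_chunks(embeddings, planes)
```

In a hierarchical build, each bucket would then be summarized by an LLM and the summaries re-embedded and re-hashed to form the next layer up.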

Dynamic Updates

EraRAG introduces a localized update strategy to handle dynamic data efficiently. New data is inserted by selectively re-partitioning and re-summarizing only the affected graph segments, which minimizes computational overhead. This strategy ensures that updates remain efficient without disrupting existing graph topology, significantly reducing token consumption and rebuilding time.

Figure 2: Token cost and graph rebuild time throughout insertions.
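
A minimal sketch of the localized insertion idea, reusing lsh_bucket from the previous snippet: a new chunk is hashed into its bucket, only that bucket's summary is regenerated, and if the bucket grows past a size threshold it is split locally. The threshold, the single-hyperplane split rule, and the resummarize callback are simplifying assumptions for illustration, not EraRAG's exact segment-size procedure.

```python
import numpy as np

MAX_BUCKET_SIZE = 8  # illustrative cap; the paper's segment-size constraint may differ

def insert_chunk(new_idx, embeddings, buckets, hyperplanes, resummarize):
    """Insert one new chunk, touching only the bucket it hashes into."""
    key = lsh_bucket(embeddings[new_idx], hyperplanes)
    bucket = buckets.setdefault(key, [])
    bucket.append(new_idx)

    if len(bucket) <= MAX_BUCKET_SIZE:
        resummarize(bucket)  # regenerate only this segment's summary
        return

    # Oversized bucket: re-partition it locally with one extra hyperplane,
    # leaving every other bucket (and its summary) untouched.
    extra = np.random.default_rng(len(key)).standard_normal(hyperplanes.shape[1])
    left = [i for i in bucket if extra @ embeddings[i] < 0]
    right = [i for i in bucket if extra @ embeddings[i] >= 0]
    del buckets[key]
    buckets[key + "0"], buckets[key + "1"] = left, right
    resummarize(left)
    resummarize(right)
```

The key design point mirrored here is that the cost of an insertion scales with the size of the affected segment, not with the size of the whole graph.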

Query Processing

During query processing, EraRAG employs a collapsed graph search strategy to integrate both detailed leaf node information and high-level semantic abstractions encoded in the summary nodes. This approach allows for flexible responses to various query types, maintaining a balance between retrieval granularity and efficiency.
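
One way to picture the collapsed search is to score leaf chunks and summary nodes in a single flat candidate pool and return the top-k, letting a query draw on both fine-grained evidence and higher-level abstractions. The node schema and cosine scoring below are simplifying assumptions, not EraRAG's actual retrieval code.

```python
import numpy as np

def collapsed_search(query_emb: np.ndarray, nodes: list[dict], k: int = 5) -> list[dict]:
    """Rank leaf chunks and summary nodes together by cosine similarity.

    Each node is assumed to look like {"text": ..., "embedding": ..., "level": ...},
    where level 0 holds raw chunks and higher levels hold LLM-written summaries.
    """
    def cosine(a: np.ndarray, b: np.ndarray) -> float:
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9))

    ranked = sorted(nodes, key=lambda n: cosine(query_emb, n["embedding"]), reverse=True)
    return ranked[:k]  # mixed-granularity context handed to the LLM
```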

Performance Evaluation

Static QA Performance

EraRAG demonstrates superior performance over existing Graph-RAG systems on multiple QA benchmarks. It achieves notable improvements in Accuracy and Recall, particularly excelling in datasets requiring deep reading comprehension and multi-hop reasoning. These gains are largely attributed to its balanced graph segmentation and efficient summarization strategies.

Figure 3: Tokens processed (left) by EraRAG and baselines during initial graph construction and consecutive insertions; detailed performance (right) of EraRAG and baselines on QuALITY.

Dynamic Insertion Efficiency

EraRAG's dynamic update mechanism delivers significant savings in both token usage and graph construction time, with reductions of up to 57.6% and 77.5%, respectively, compared to existing solutions. This efficiency highlights the framework's suitability for real-world applications with rapidly evolving corpora.

Conclusion

EraRAG provides a scalable, efficient system for retrieval-augmented generation, enabling real-time adaptation to growing datasets without compromising retrieval performance. Its use of LSH for dynamic partitioning and targeted re-summarization yields significant computational savings and robustness, offering a clear path forward for practical RAG implementations in dynamic information environments.
