RAPTOR: Recursive Abstractive Processing for Tree-Organized Retrieval (2401.18059v1)

Published 31 Jan 2024 in cs.CL and cs.LG

Abstract: Retrieval-augmented LLMs can better adapt to changes in world state and incorporate long-tail knowledge. However, most existing methods retrieve only short contiguous chunks from a retrieval corpus, limiting holistic understanding of the overall document context. We introduce the novel approach of recursively embedding, clustering, and summarizing chunks of text, constructing a tree with differing levels of summarization from the bottom up. At inference time, our RAPTOR model retrieves from this tree, integrating information across lengthy documents at different levels of abstraction. Controlled experiments show that retrieval with recursive summaries offers significant improvements over traditional retrieval-augmented LMs on several tasks. On question-answering tasks that involve complex, multi-step reasoning, we show state-of-the-art results; for example, by coupling RAPTOR retrieval with the use of GPT-4, we can improve the best performance on the QuALITY benchmark by 20% in absolute accuracy.

Introduction to RAPTOR

Retrieval-augmented language models (LMs) enhance performance by supplementing their pre-encoded knowledge with data drawn from external corpora. Standard retrieval methods, however, fetch short, contiguous text snippets and therefore fail to capture the full document context. RAPTOR (Recursive Abstractive Processing for Tree-Organized Retrieval) addresses this limitation by recursively embedding, clustering, and summarizing text to construct a tree, which lets it retrieve information from lengthy documents at multiple levels of abstraction. The key contribution is RAPTOR's ability to surface contextually rich content that spans discrete portions of a document.

Comparison with Existing Methods

RAPTOR extends current retrieval systems by delivering state-of-the-art performance, particularly on complex, multi-step reasoning tasks. In the reported experiments, coupling RAPTOR with GPT-4 improved the best absolute accuracy on the QuALITY benchmark by 20%. Such tasks demand comprehensive document understanding and require integrating knowledge from disparate parts of the text. By retrieving recursive summaries, RAPTOR supplies context at multiple levels of granularity and outperforms existing retrieval-augmented methods.

RAPTOR's Technical Concept

Building the RAPTOR retrieval tree starts with segmenting the text into short chunks that are embedded with Sentence-BERT (SBERT). The chunks are then clustered on their embeddings using a Gaussian Mixture Model (GMM), and an LLM summarizes each cluster. The tree is built bottom-up by re-embedding, clustering, and summarizing the resulting summaries until further clustering is no longer viable. Two querying methods are employed: tree traversal and collapsed tree. The latter, which performs notably better, flattens the tree into a single layer and retrieves the most relevant nodes until a token threshold is reached, keeping the retrieved context within the model's input limit.
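The Python sketch below illustrates this bottom-up construction under simplifying assumptions: it uses a fixed SBERT checkpoint from sentence-transformers, a plain GMM with a heuristic cluster count and hard assignments (the paper additionally applies UMAP dimensionality reduction, BIC-based model selection, and soft cluster membership), and a placeholder `summarize` function standing in for the LLM call. It is not the authors' implementation.

```python
# Illustrative RAPTOR-style tree construction (simplified sketch, not the authors' code).
from sentence_transformers import SentenceTransformer
from sklearn.mixture import GaussianMixture
import numpy as np

encoder = SentenceTransformer("all-MiniLM-L6-v2")  # assumed SBERT checkpoint

def summarize(texts: list[str]) -> str:
    """Placeholder for an LLM call that abstracts a cluster into one summary."""
    raise NotImplementedError("plug in an LLM call here")

def build_layer(nodes: list[str]) -> list[str]:
    """Embed the nodes, cluster them with a GMM, and summarize each cluster."""
    embeddings = encoder.encode(nodes)
    n_clusters = max(1, len(nodes) // 5)  # heuristic; the paper selects this via BIC
    gmm = GaussianMixture(n_components=n_clusters, random_state=0).fit(embeddings)
    labels = gmm.predict(embeddings)
    return [summarize([nodes[i] for i in np.flatnonzero(labels == c)])
            for c in range(n_clusters)]

def build_tree(chunks: list[str]) -> list[list[str]]:
    """Recursively embed, cluster, and summarize until a layer is too small to cluster."""
    layers = [chunks]
    while len(layers[-1]) > 5:  # stop once further clustering is no longer meaningful
        layers.append(build_layer(layers[-1]))
    return layers  # layers[0] = leaf chunks, layers[-1] = most abstract summaries
```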

Superiority and Scalability of RAPTOR

RAPTOR consistently outperforms baselines across datasets, retrieval systems, and LLMs. Notably, pairing RAPTOR with GPT-4 yields a marked jump in F1 Match scores. Ablation analyses confirm that nodes from different layers of the tree all contribute: querying the collapsed full tree gives better results than restricting retrieval to any single layer. In terms of scalability, build time and token expenditure grow linearly with document length, so RAPTOR can process large, complex corpora efficiently.
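As a rough illustration of the collapsed-tree querying used in these experiments, the sketch below flattens all layers into one pool, ranks nodes by cosine similarity to the query embedding, and adds nodes until a token budget is exhausted. The encoder choice and the whitespace-based token count are assumptions made for brevity; a real implementation would use the downstream model's tokenizer.

```python
# Illustrative collapsed-tree retrieval under a token budget (sketch, not the authors' code).
from sentence_transformers import SentenceTransformer
import numpy as np

encoder = SentenceTransformer("all-MiniLM-L6-v2")  # assumed SBERT checkpoint

def collapsed_tree_retrieve(query: str, layers: list[list[str]],
                            token_budget: int = 2000) -> list[str]:
    """Collapse all tree layers into one pool, rank nodes by cosine similarity
    to the query, and greedily add nodes until the token budget is spent."""
    nodes = [node for layer in layers for node in layer]
    node_emb = encoder.encode(nodes)
    query_emb = encoder.encode([query])[0]
    sims = node_emb @ query_emb / (
        np.linalg.norm(node_emb, axis=1) * np.linalg.norm(query_emb) + 1e-9)
    selected, used = [], 0
    for idx in np.argsort(-sims):                  # most similar node first
        cost = len(nodes[idx].split())             # crude proxy for a token count
        if used + cost > token_budget:
            break
        selected.append(nodes[idx])
        used += cost
    return selected
```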

Conclusion and Outlook

RAPTOR sets a new standard for retrieval-augmented LM systems. By providing precise context at varying levels of abstraction, it strengthens question answering over long documents and demonstrates the combined value of recursive summarization and structured context retrieval. The reported results show gains in both accuracy and efficiency, and the approach points toward retrieval pipelines that capture document structure rather than isolated snippets.

Authors (6)
  1. Parth Sarthi (1 paper)
  2. Salman Abdullah (2 papers)
  3. Aditi Tuli (1 paper)
  4. Shubh Khanna (2 papers)
  5. Anna Goldie (19 papers)
  6. Christopher D. Manning (169 papers)
Citations (70)