
Bayesian inference to improve quality of Retrieval Augmented Generation (2408.08901v1)

Published 12 Aug 2024 in cs.IR

Abstract: Retrieval Augmented Generation, or RAG, is the most popular pattern for modern Large Language Model (LLM) applications. RAG takes a user query and finds relevant paragraphs of context in a large corpus, typically stored in a vector database. After this first level of search over the vector database, the top n chunks of relevant text are included directly in the context and sent as the prompt to the LLM. The problem with this approach is that the quality of the text chunks depends on the effectiveness of the search: there is no strong post-processing step after search to determine whether a chunk actually holds enough information to include in the prompt. In addition, there may be chunks with conflicting information on the same subject, and the model has no prior indication of which chunk to prioritize when making a decision. This often leads the model to state that the sources conflict and that it cannot produce an answer. In this research we propose a Bayesian approach to verify the quality of text chunks in the search results. Bayes' theorem relates the conditional probability of a hypothesis given evidence to the likelihood of the evidence and the prior probability of the hypothesis. We propose that estimating the likelihood that a text chunk yields a quality answer, combined with a prior probability over the quality of text chunks, can improve the overall quality of responses from RAG systems. The LLM itself can be used to estimate the likelihood that a context paragraph is relevant. For the prior probability of a text chunk, we use the page number of the parsed document, under the assumption that paragraphs on earlier pages are more likely to contain findings and to be relevant for generalizing an answer.
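
A minimal sketch of the re-scoring idea the abstract describes, assuming an unnormalized posterior of the form P(chunk is useful | query) ∝ P(query answerable | chunk) × P(chunk), where the likelihood comes from an LLM relevance judgment and the prior decreases with page number. Names such as `llm_relevance_score`, `page_prior`, and `rerank` are illustrative assumptions, not the paper's actual implementation; the token-overlap likelihood is only a runnable stand-in for the LLM call.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    text: str
    page_number: int  # page where the chunk appears in the parsed document


def llm_relevance_score(query: str, chunk: Chunk) -> float:
    """Stand-in for the likelihood P(query answerable | chunk).

    The paper proposes asking the LLM itself to rate relevance; here a simple
    token-overlap ratio keeps the sketch self-contained and runnable.
    """
    q = set(query.lower().split())
    c = set(chunk.text.lower().split())
    return len(q & c) / max(len(q), 1)


def page_prior(chunk: Chunk, total_pages: int) -> float:
    """Prior P(chunk) that decreases with page number, following the
    abstract's assumption that earlier pages are more likely to hold findings."""
    return (total_pages - chunk.page_number + 1) / total_pages


def posterior_score(query: str, chunk: Chunk, total_pages: int) -> float:
    """Unnormalized posterior: likelihood * prior (Bayes' theorem without the
    constant denominator, which does not affect the ranking)."""
    return llm_relevance_score(query, chunk) * page_prior(chunk, total_pages)


def rerank(query: str, chunks: list[Chunk], total_pages: int, top_k: int = 5) -> list[Chunk]:
    """Re-rank retrieved chunks by posterior score and keep the top_k for the prompt."""
    return sorted(
        chunks,
        key=lambda c: posterior_score(query, c, total_pages),
        reverse=True,
    )[:top_k]


if __name__ == "__main__":
    chunks = [
        Chunk("Results show retrieval quality improves with re-ranking.", page_number=2),
        Chunk("Appendix: hyperparameter tables for all runs.", page_number=40),
    ]
    top = rerank("does re-ranking improve retrieval quality", chunks, total_pages=40, top_k=1)
    print(top[0].text)
```

Because only the relative ordering of chunks matters, the normalizing constant in Bayes' theorem is dropped and the product of likelihood and prior is used directly as a ranking score.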

Authors (1)
  1. Dattaraj Rao