
UltraRAG: A Modular and Automated Toolkit for Adaptive Retrieval-Augmented Generation (2504.08761v1)

Published 31 Mar 2025 in cs.IR

Abstract: Retrieval-Augmented Generation (RAG) significantly enhances the performance of LLMs in downstream tasks by integrating external knowledge. To facilitate researchers in deploying RAG systems, various RAG toolkits have been introduced. However, many existing RAG toolkits lack support for knowledge adaptation tailored to specific application scenarios. To address this limitation, we propose UltraRAG, a RAG toolkit that automates knowledge adaptation throughout the entire workflow, from data construction and training to evaluation, while ensuring ease of use. UltraRAG features a user-friendly WebUI that streamlines the RAG process, allowing users to build and optimize systems without coding expertise. It supports multimodal input and provides comprehensive tools for managing the knowledge base. With its highly modular architecture, UltraRAG delivers an end-to-end development solution, enabling seamless knowledge adaptation across diverse user scenarios. The code, demonstration videos, and installable package for UltraRAG are publicly available at https://github.com/OpenBMB/UltraRAG.

Summary

  • The paper introduces an end-to-end toolkit that automates and simplifies the RAG workflow through a modular design and a user-friendly WebUI.
  • It details comprehensive modules for knowledge management, data construction, fine-tuning with SFT/DPO, and standardized evaluation across 40+ benchmarks.
  • Case studies in the legal domain show significant improvements in retrieval and generation metrics, validating UltraRAG’s adaptive capabilities.

Retrieval-Augmented Generation (RAG) is a technique used to improve the performance of LLMs by providing them with external knowledge. While effective, building and deploying RAG systems can be challenging due to diverse data formats, complex component coordination, and the rapid evolution of algorithms. Existing RAG toolkits like LangChain and LlamaIndex offer modularity but often lack user-friendly interfaces, comprehensive knowledge management features, and crucial support for adapting RAG systems to specific domains or tasks, limiting their practical applicability.
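
At its core, RAG prepends retrieved passages to the model's prompt before generation. The following minimal sketch illustrates this retrieve-then-generate loop with a toy lexical retriever and a caller-supplied `generate` function; both are hypothetical stand-ins for illustration, not UltraRAG's API:

```python
def score(query: str, doc: str) -> float:
    """Toy lexical-overlap score; real systems use dense embeddings."""
    q, d = set(query.lower().split()), set(doc.lower().split())
    return len(q & d) / max(len(q), 1)

def retrieve(query: str, corpus: list[str], k: int = 3) -> list[str]:
    """Return the top-k documents ranked by the overlap score."""
    return sorted(corpus, key=lambda doc: score(query, doc), reverse=True)[:k]

def rag_answer(query: str, corpus: list[str], generate) -> str:
    """Stuff retrieved passages into the prompt, then call the LLM."""
    context = "\n".join(retrieve(query, corpus))
    prompt = f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"
    return generate(prompt)
```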

UltraRAG is introduced as a modular and automated toolkit designed to address these challenges, particularly focusing on facilitating knowledge adaptation throughout the RAG workflow. It provides an end-to-end solution covering data construction, training, evaluation, and inference, making it easier for both researchers and practitioners to build and optimize RAG systems. A key feature of UltraRAG is its user-friendly WebUI, which lowers the technical barrier, allowing users to manage knowledge bases, configure models, and run experiments without extensive coding.

The toolkit is structured around two global setting modules and three core functional modules:

  1. Global Setting Modules:
    • Model Management: Enables the management, deployment, and usage of various models required for RAG (retrieval, reranker, generation). It supports loading local models via vLLM (2310.04605) or HuggingFace Transformers (2010.03701), as well as integrating API-based models. It provides pre-configured environments (Docker, microservices) for seamless model integration.
    • Knowledge Management: Simplifies handling external knowledge bases. Users can upload files in various formats (TXT, PDF, Markdown, JSON, CSV). The module allows configuration of processing parameters such as chunk size and overlap, and automates the encoding and indexing of documents with a selected embedding model (see the chunking sketch after this list).
  2. Functional Modules:
    • Data Construction: Automates the generation of training and evaluation data tailored to the RAG pipeline and specific knowledge bases. It generates queries from documents and constructs datasets for retrieval, reranking, and generation models, including mining hard negative samples (2007.00808) for retrieval (see the sketch after this list) and creating SFT and DPO (2310.11452) datasets for generation models. Users can also upload custom datasets and mix data sources for multi-task training.
    • Training: Supports finetuning embedding and generation models on the data produced by the Data Construction module, currently via Supervised Fine-Tuning (SFT) and Direct Preference Optimization (DPO) with techniques like LoRA (2106.09685); more strategies are planned (see the DPO loss sketch after this list).
    • Evaluation & Inference: Provides tools for comprehensive assessment using standard retrieval and generation metrics across over 40 benchmark datasets. It supports custom datasets and a unified data format for flexible evaluation. The inference module offers predefined RAG workflows (Vanilla RAG, KBAlign (2411.14790), VisRAG (2410.10594), Adaptive-Note (2410.08821)) and allows users to build custom pipelines. It also supports streaming output, visualization of intermediate steps, and local deployment using Ollama for easily creating customized RAG applications.
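
As a concrete illustration of the Knowledge Management settings above, fixed-size chunking with overlap can be sketched as follows; the function name and defaults are illustrative, not UltraRAG's actual interface:

```python
def chunk_text(text: str, chunk_size: int = 512, overlap: int = 64) -> list[str]:
    """Split text into fixed-size chunks whose windows overlap, so that
    content cut at a boundary still appears intact in one chunk."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, max(len(text) - overlap, 1), step)]

# Each chunk would then be encoded with the selected embedding model and
# added to the index, e.g.: index = [(c, embed(c)) for c in chunks]
# (embed is a hypothetical embedding call.)
```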
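
The hard-negative mining step in Data Construction can be approximated as below: rank the corpus by similarity to a query and keep the highest-scoring non-positive documents. This is a generic sketch of the technique (2007.00808) with hypothetical names; `skip_top` drops the very top ranks, which often hide unlabeled positives:

```python
import numpy as np

def mine_hard_negatives(query_emb: np.ndarray, doc_embs: np.ndarray,
                        positive_idx: int, num_negs: int = 5,
                        skip_top: int = 1) -> list[int]:
    """Return indices of top-ranked non-positive docs as hard negatives."""
    sims = doc_embs @ query_emb / (
        np.linalg.norm(doc_embs, axis=1) * np.linalg.norm(query_emb) + 1e-9)
    ranked = np.argsort(-sims)                      # best match first
    candidates = [int(i) for i in ranked if i != positive_idx]
    return candidates[skip_top:skip_top + num_negs]
```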
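
For the Training module, the DPO objective reduces to a logistic loss over policy-versus-reference log-probability margins on chosen and rejected responses. A minimal PyTorch sketch of the standard loss follows (sequence log-probs assumed precomputed; this is not UltraRAG's training code):

```python
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps: torch.Tensor,
             policy_rejected_logps: torch.Tensor,
             ref_chosen_logps: torch.Tensor,
             ref_rejected_logps: torch.Tensor,
             beta: float = 0.1) -> torch.Tensor:
    """-log sigmoid(beta * ((pi_w - ref_w) - (pi_l - ref_l))), averaged."""
    chosen_margin = policy_chosen_logps - ref_chosen_logps
    rejected_margin = policy_rejected_logps - ref_rejected_logps
    return -F.logsigmoid(beta * (chosen_margin - rejected_margin)).mean()
```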

UltraRAG pre-implements several popular RAG methods, including Vanilla RAG, RA-DIT (2310.01352), Adaptive-Note (2410.08821), VisRAG (2410.10594), KBAlign (2411.14790), and RAG-DDR (2410.13509), aiming to accelerate research by providing reproducible baselines and facilitating fair comparisons.

To demonstrate its knowledge adaptation capability, the paper presents a case study on building a legal-domain RAG system. Leveraging UltraRAG's modules, a legal knowledge base (1,000+ books) was ingested and indexed. The embedding model (MiniCPM-Embedding-Light) was finetuned on 2,800 synthetically generated samples produced from the legal corpus by the Data Construction module. This finetuning improved retrieval performance (MRR@10, NDCG@10, Recall@10) on legal-domain samples, indicating better adaptation to legal terminology and document structures.
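
The retrieval metrics reported here have standard definitions; a self-contained sketch for binary relevance follows (the names are illustrative, not the paper's evaluation code):

```python
import math

def mrr_at_k(ranked_ids: list, relevant_ids: set, k: int = 10) -> float:
    """Reciprocal rank of the first relevant doc within the top k."""
    for rank, doc_id in enumerate(ranked_ids[:k], start=1):
        if doc_id in relevant_ids:
            return 1.0 / rank
    return 0.0

def recall_at_k(ranked_ids: list, relevant_ids: set, k: int = 10) -> float:
    """Fraction of relevant docs retrieved in the top k."""
    return len(set(ranked_ids[:k]) & relevant_ids) / max(len(relevant_ids), 1)

def ndcg_at_k(ranked_ids: list, relevant_ids: set, k: int = 10) -> float:
    """Binary-relevance NDCG: DCG of the ranking over the ideal DCG."""
    dcg = sum(1.0 / math.log2(rank + 1)
              for rank, doc_id in enumerate(ranked_ids[:k], start=1)
              if doc_id in relevant_ids)
    ideal = sum(1.0 / math.log2(rank + 1)
                for rank in range(1, min(len(relevant_ids), k) + 1))
    return dcg / ideal if ideal > 0 else 0.0
```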

For generation, the MiniCPM-3-4B model was finetuned with two methods implemented in UltraRAG: UltraRAG-DDR (2410.13509), which uses differentiable data rewards with DPO, and UltraRAG-KBAlign (2411.14790), which uses SFT with combined short- and long-range annotations. Experiments on the LawBench (2309.16289) Article Prediction and Consultation tasks showed that both finetuned models outperformed Vanilla RAG. UltraRAG-DDR achieved a significant relative improvement in ROUGE-L score on Article Prediction. On the Consultation task, the RAG adaptation workflow (combining the UltraRAG-Embedding and UltraRAG-DDR finetuned models) achieved the best results, surpassing both Vanilla RAG and the DeepNote workflow. Case studies further show that the finetuned models reference specific legal articles accurately, where a vanilla RAG setup may return less specific or incorrect information.
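
ROUGE-L, the generation metric cited above, measures longest-common-subsequence (LCS) overlap between a candidate and a reference. A minimal F1 variant can be computed as follows (a generic sketch, not the paper's exact scorer):

```python
def rouge_l_f1(candidate: str, reference: str) -> float:
    """ROUGE-L: F1 over the longest common subsequence of tokens."""
    c, r = candidate.split(), reference.split()
    # Dynamic-programming table for LCS length.
    dp = [[0] * (len(r) + 1) for _ in range(len(c) + 1)]
    for i, ct in enumerate(c, 1):
        for j, rt in enumerate(r, 1):
            dp[i][j] = dp[i-1][j-1] + 1 if ct == rt else max(dp[i-1][j], dp[i][j-1])
    lcs = dp[-1][-1]
    if lcs == 0:
        return 0.0
    precision, recall = lcs / len(c), lcs / len(r)
    return 2 * precision * recall / (precision + recall)
```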

In summary, UltraRAG provides a practical, end-to-end, automated solution for developing and deploying RAG systems. Its modular design, user-friendly WebUI, and focus on knowledge adaptation through automated data construction and training make it a valuable toolkit for researchers and practitioners building high-performing, domain-specific RAG applications. The legal-domain case study effectively demonstrates its capability to enhance performance through targeted knowledge adaptation.
