MobileRAG: A Fast, Memory-Efficient, and Energy-Efficient Method for On-Device RAG (2507.01079v1)

Published 1 Jul 2025 in cs.DB

Abstract: Retrieval-Augmented Generation (RAG) has proven effective on server infrastructures, but its application on mobile devices is still underexplored due to limited memory and power resources. Existing vector search and RAG solutions largely assume abundant computation resources, making them impractical for on-device scenarios. In this paper, we propose MobileRAG, a fully on-device pipeline that overcomes these limitations by combining a mobile-friendly vector search algorithm, EcoVector, with a lightweight Selective Content Reduction (SCR) method. By partitioning and partially loading index data, EcoVector drastically reduces both memory footprint and CPU usage, while the SCR method filters out irrelevant text to reduce the LLM input size without degrading accuracy. Extensive experiments demonstrate that MobileRAG significantly outperforms conventional vector search and RAG methods in terms of latency, memory usage, and power consumption, while maintaining accuracy and enabling offline operation to safeguard privacy in resource-constrained environments.

Tweets

https://twitter.com/_reachsumit/status/1940618472817676630
