Mitigate the computational cost of reasoning models used in RAG
Establish methods to reduce the computational cost—including token consumption and inference latency—of reasoning large language models used as generators in Retrieval-Augmented Generation pipelines for multi-hop question answering.
References
However, the cost problem of reasoning models remains unsolved.
— LIR$^3$AG: A Lightweight Rerank Reasoning Strategy Framework for Retrieval-Augmented Generation
(2512.18329 - Chen et al., 20 Dec 2025) in Section 3.1 (Reasoning Models in RAG)