Prompt-tuning Mistral-7B-Instruct to Reach REAPER-level Performance
Determine whether in-context prompt tuning alone, with no fine-tuning, can make the Mistral-7B-Instruct-v0.2 language model generate high-accuracy retrieval plans for the REAPER task in a conversational shopping assistant. Specifically, the model must produce correct tool sequences and arguments across the six defined retrieval classes (Customer Support, Shipment Status, Product Search, Product QnA, Review Summary, and No-retrieval) at levels comparable to the fine-tuned REAPER model, which reaches approximately 96% tool-selection accuracy and 92% argument accuracy.
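To make the two target metrics concrete, the following is a minimal sketch of how a generated retrieval plan could be scored against a reference plan. The plan representation, tool names, and field layout here are assumptions for illustration, not the format used by REAPER itself; a step contributes to argument accuracy only if its tool was already selected correctly.

```python
# Hypothetical plan format: a list of (tool_name, arguments) steps.
# Tool names below are illustrative labels for the six retrieval classes.
REAPER_TOOLS = {
    "customer_support", "shipment_status", "product_search",
    "product_qna", "review_summary", "no_retrieval",
}

def score_plans(predicted, reference):
    """Return (tool_accuracy, argument_accuracy) over paired plan steps."""
    tool_hits = arg_hits = 0
    total = len(reference)
    for (pred_tool, pred_args), (ref_tool, ref_args) in zip(predicted, reference):
        # Tool-selection accuracy: the right tool was chosen for this step.
        if pred_tool == ref_tool and pred_tool in REAPER_TOOLS:
            tool_hits += 1
            # Argument accuracy: arguments match, given a correct tool.
            if pred_args == ref_args:
                arg_hits += 1
    return tool_hits / total, arg_hits / total

pred = [("product_search", {"query": "wireless earbuds"}),
        ("shipment_status", {"order_id": "123"})]
ref  = [("product_search", {"query": "wireless earbuds"}),
        ("shipment_status", {"order_id": "456"})]
tool_acc, arg_acc = score_plans(pred, ref)
# tool_acc == 1.0 (both tools correct), arg_acc == 0.5 (one argument set wrong)
```

Under this scoring, "96% tool selection accuracy and 92% argument accuracy" means the prompt-tuned model would need to match the reference tool on 96% of steps and, on 92% of steps, produce the exact arguments as well.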
References
Despite several weeks' worth of effort, we could not prompt-tune Mistral-7B-Instruct-v0.2 to reach the target performance.
— REAPER: Reasoning based Retrieval Planning for Complex RAG Systems
(arXiv:2407.18553, Joshi et al., 26 Jul 2024), Section 6.1 (Comparison with Open Models)