SelectLLM: Query-Aware Efficient Selection Algorithm for Large Language Models (2408.08545v1)

Published 16 Aug 2024 in cs.CL

Abstract: Large language models (LLMs) have grown in popularity because of their remarkable success across a wide range of tasks, which has led to the active development of a large and diverse set of LLMs. However, individual LLMs have limitations when applied to complex tasks, owing to factors such as training biases, model size, and the datasets used. A promising approach is to efficiently harness the diverse capabilities of multiple LLMs to overcome these individual limitations. Towards this goal, we introduce a novel LLM selection algorithm called SelectLLM. The algorithm directs each input query to the most suitable subset of LLMs from a large pool, so that the selected models collectively provide the correct response efficiently. SelectLLM uses a multi-label classifier, and uses the classifier's predictions and confidence scores to design policies for selecting a query-aware, lightweight subset of LLMs. Our findings show that the proposed model outperforms individual LLMs and achieves performance competitive with similarly sized, computationally expensive top-performing LLM subsets. Specifically, compared with a similarly sized top-performing LLM subset, SelectLLM reduces latency on two standard reasoning benchmarks: 13% lower latency on GSM8K and 70% lower latency on MMLU. Additionally, we conduct comprehensive analyses and ablation studies, which validate the robustness of the proposed model.
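The selection step described in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the abstract does not specify the selection policies, so the threshold rule, the cap on subset size, the majority-vote aggregation, and all names below (`select_llms`, `majority_vote`, the `model_*` labels, the score values) are assumptions made purely for illustration. The sketch assumes a multi-label classifier has already produced one confidence score per candidate LLM for a given query.

```python
from collections import Counter


def select_llms(confidences, threshold=0.5, max_models=3):
    """Pick a small, query-specific subset of LLMs.

    `confidences` maps a (hypothetical) model name to the classifier's
    confidence that this model answers the query correctly.
    The threshold and size cap are illustrative assumptions.
    """
    # Rank candidate models from most to least confident.
    ranked = sorted(confidences.items(), key=lambda kv: kv[1], reverse=True)
    # Keep models at or above the threshold, capped at max_models.
    chosen = [name for name, score in ranked if score >= threshold][:max_models]
    # Fall back to the single best model if none clears the threshold.
    return chosen or [ranked[0][0]]


def majority_vote(answers):
    """Combine the selected models' answers (assumed aggregation rule)."""
    return Counter(answers).most_common(1)[0][0]


# Hypothetical per-query confidence scores from the multi-label classifier.
scores = {"model_a": 0.91, "model_b": 0.34, "model_c": 0.77, "model_d": 0.52}
subset = select_llms(scores, threshold=0.5, max_models=2)
final_answer = majority_vote(["42", "42", "41"])
```

Only the models in `subset` would be queried, which is where a latency saving over querying a fixed top-performing subset could come from under these assumptions.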
