An Expert is Worth One Token: Synergizing Multiple Expert LLMs as Generalist via Expert Token Routing
Abstract: We present Expert-Token-Routing, a unified generalist framework that enables seamless integration of multiple expert LLMs. Our framework represents each expert LLM as a special expert token within the vocabulary of a meta LLM, which can then route to an expert LLM simply by generating the corresponding token. Expert-Token-Routing not only supports learning the implicit expertise of expert LLMs from existing instruction datasets but also allows new expert LLMs to be added dynamically in a plug-and-play manner. It also conceals the collaboration process from the user, so the system can be interacted with as though it were a single LLM. Our framework outperforms various existing multi-LLM collaboration paradigms on benchmarks spanning six diverse expert domains, demonstrating its effectiveness and robustness in building a generalist LLM system by synergizing multiple expert LLMs.
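The routing idea described in the abstract can be illustrated with a short sketch: expert tokens are appended to the meta LLM's output vocabulary, and whenever decoding selects one of them, the query is forwarded to the corresponding expert LLM. The sketch below is a minimal, hypothetical illustration assuming access to the meta LLM's final hidden state and LM-head weights; names such as `route_or_generate`, `expert_embeds`, and `experts` are assumptions for exposition, not the paper's code.

```python
# Minimal sketch of expert-token routing (hypothetical names; not the authors' implementation).
# Assumption: each expert token is an extra row appended to the meta LLM's output
# (LM-head) embedding matrix, so routing is just next-token selection.
import torch

def route_or_generate(hidden_state, lm_head_weight, expert_embeds, experts, query):
    """Pick the next token; if it falls in the expert-token range, call that expert.

    hidden_state:   (d,)   final hidden state of the meta LLM at the current position
    lm_head_weight: (V, d) original output embedding matrix of the meta LLM
    expert_embeds:  (E, d) learned expert-token embeddings, one per expert LLM
    experts:        list of E callables, each wrapping an expert LLM
    query:          the user query to forward when an expert token is selected
    """
    # Score ordinary vocabulary tokens and expert tokens jointly.
    logits = torch.cat([lm_head_weight, expert_embeds], dim=0) @ hidden_state
    next_id = int(torch.argmax(logits))

    vocab_size = lm_head_weight.shape[0]
    if next_id >= vocab_size:
        # An expert token was "generated": route the query to the matching expert LLM
        # and return its answer instead of continuing meta-LLM decoding.
        return experts[next_id - vocab_size](query)
    return next_id  # otherwise decode normally with the meta LLM
```

In this reading, only the expert-token embeddings need to be learned; the meta LLM and the expert LLMs stay frozen, which is what makes plug-and-play extension with new experts straightforward.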