A Systematic Investigation of Distilling Large Language Models into Cross-Encoders for Passage Re-ranking (2405.07920v2)
Published 13 May 2024 in cs.IR
Abstract: Cross-encoders distilled from LLMs are often more effective re-rankers than cross-encoders fine-tuned on manually labeled data. However, the distilled models usually do not reach their teacher LLM's effectiveness. To investigate whether best practices for fine-tuning cross-encoders on manually labeled data (e.g., hard-negative sampling, deep sampling, and listwise loss functions) can help to improve LLM ranker distillation, we construct and release a new distillation dataset: Rank-DistiLLM. In our experiments, cross-encoders trained on Rank-DistiLLM reach the effectiveness of LLMs while being orders of magnitude more efficient. Our code and data are available at https://github.com/webis-de/msmarco-LLM-distillation.
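To give a concrete sense of the listwise distillation objective mentioned in the abstract, the sketch below shows a RankNet-style pairwise loss over teacher-ordered passages in PyTorch. It is a minimal illustration, not the paper's released implementation: the function name, the assumption that the student cross-encoder's scores arrive already sorted by the teacher LLM's ranking, and the toy inputs are all illustrative.

```python
import torch

def ranknet_distillation_loss(student_scores: torch.Tensor) -> torch.Tensor:
    """RankNet-style pairwise loss over one query's passage list.

    `student_scores` has shape (n,) and is assumed to be ordered by the
    teacher LLM's ranking (index 0 = most relevant according to the teacher).
    Every pair (i, j) with i < j should then satisfy score_i > score_j.
    """
    n = student_scores.size(0)
    # Pairwise score differences s_i - s_j for all passage pairs.
    diff = student_scores.unsqueeze(1) - student_scores.unsqueeze(0)  # (n, n)
    # Keep only pairs where the teacher ranks i above j (upper triangle).
    upper = torch.triu(torch.ones(n, n, dtype=torch.bool), diagonal=1)
    # softplus(-(s_i - s_j)) = log(1 + exp(-(s_i - s_j))), i.e. the RankNet
    # cross-entropy with target probability 1 for every teacher-ordered pair.
    return torch.nn.functional.softplus(-diff[upper]).mean()


# Toy usage: a student that agrees with the teacher's ordering gets a
# lower loss than one that ranks the passages in reverse.
agreeing = torch.tensor([3.0, 2.0, 1.0, 0.0])
reversed_ = torch.tensor([0.0, 1.0, 2.0, 3.0])
print(ranknet_distillation_loss(agreeing), ranknet_distillation_loss(reversed_))
```

In practice the student scores would come from a cross-encoder over (query, passage) pairs, with hard negatives and deeper candidate lists sampled as described in the paper; the loss shape above is only meant to make the listwise supervision signal tangible.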
- Ferdinand Schlatt
- Maik Fröbe
- Harrisen Scells
- Shengyao Zhuang
- Bevan Koopman
- Guido Zuccon
- Benno Stein
- Martin Potthast
- Matthias Hagen