A Systematic Investigation of Distilling Large Language Models into Cross-Encoders for Passage Re-ranking (2405.07920v2)

Published 13 May 2024 in cs.IR

Abstract: Cross-encoders distilled from LLMs are often more effective re-rankers than cross-encoders fine-tuned on manually labeled data. However, the distilled models usually do not reach their teacher LLM's effectiveness. To investigate whether best practices for fine-tuning cross-encoders on manually labeled data (e.g., hard-negative sampling, deep sampling, and listwise loss functions) can help to improve LLM ranker distillation, we construct and release a new distillation dataset: Rank-DistiLLM. In our experiments, cross-encoders trained on Rank-DistiLLM reach the effectiveness of LLMs while being orders of magnitude more efficient. Our code and data are available at https://github.com/webis-de/msmarco-LLM-distillation.
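The abstract mentions distilling an LLM re-ranker into a cross-encoder with a listwise loss over teacher-ordered passages. The paper's actual training setup is in the linked repository; the following is only a minimal sketch of the general idea, assuming a RankNet-style pairwise formulation of the listwise objective. The function name and tensor layout are illustrative assumptions, not the authors' implementation.

```python
import torch

def ranknet_distillation_loss(student_scores: torch.Tensor,
                              teacher_ranking: torch.Tensor) -> torch.Tensor:
    """Illustrative RankNet-style distillation loss for one query.

    student_scores: (n,) cross-encoder relevance scores for n candidate passages.
    teacher_ranking: (n,) permutation of passage indices, best first, as produced
                     by the teacher LLM re-ranker.
    """
    # Reorder student scores so position 0 holds the teacher's top passage.
    ordered = student_scores[teacher_ranking]
    # For every pair (i, j) the teacher ranks i above j, the student should
    # score passage i higher than passage j.
    diff = ordered.unsqueeze(1) - ordered.unsqueeze(0)  # diff[i, j] = s_i - s_j
    upper = torch.triu(torch.ones_like(diff), diagonal=1).bool()
    # log(1 + exp(-(s_i - s_j))): penalize pairs the student orders incorrectly.
    return torch.nn.functional.softplus(-diff[upper]).mean()

# Toy usage: 4 candidate passages, teacher ranks them [2, 0, 3, 1].
scores = torch.tensor([0.2, -0.5, 1.3, 0.1], requires_grad=True)
loss = ranknet_distillation_loss(scores, torch.tensor([2, 0, 3, 1]))
loss.backward()
```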

Authors (9)
  1. Ferdinand Schlatt (5 papers)
  2. Maik Fröbe (20 papers)
  3. Harrisen Scells (22 papers)
  4. Shengyao Zhuang (42 papers)
  5. Bevan Koopman (37 papers)
  6. Guido Zuccon (73 papers)
  7. Benno Stein (44 papers)
  8. Martin Potthast (64 papers)
  9. Matthias Hagen (33 papers)
Citations (3)