Investigating Multi-layer Representations for Dense Passage Retrieval (2509.23861v1)

Published 28 Sep 2025 in cs.IR

Abstract: Dense retrieval models usually represent a document with vectors from the last hidden layer of the document encoder, even though representations in different layers of a pre-trained LLM usually contain different kinds of linguistic knowledge and behave differently during fine-tuning. We therefore investigate composing a document's representation from multiple encoder layers, which we denote Multi-layer Representations (MLR). We first investigate how representations in different layers affect MLR's performance under the multi-vector retrieval setting, and then propose pooling strategies that reduce multi-vector models to single-vector ones to improve retrieval efficiency. Experiments demonstrate the effectiveness of MLR over dual encoder, ME-BERT and ColBERT in the single-vector retrieval setting, and show that it works well with other advanced training techniques such as retrieval-oriented pre-training and hard negative mining.
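
To make the two settings in the abstract concrete, here is a minimal NumPy sketch. It is not the authors' implementation. The abstract does not specify which layers are used, how vectors are taken from each layer, or which pooling strategy is chosen, so the choices below are assumptions made purely for illustration: the first-token vector of each selected layer, max-over-vectors scoring for the multi-vector setting, and mean pooling for the single-vector setting. The function names and the random stand-in for encoder hidden states are hypothetical.

```python
import numpy as np


def multi_layer_vectors(hidden_states, layers):
    """Take the first-token vector from each selected encoder layer.

    hidden_states: array of shape (num_layers, seq_len, dim)
    layers: list of layer indices (an assumed choice, not from the paper)
    returns: array of shape (len(layers), dim)
    """
    return hidden_states[layers, 0, :]


def multi_vector_score(query_vec, doc_vecs):
    """Multi-vector setting (assumed): best dot product over document vectors."""
    return float(np.max(doc_vecs @ query_vec))


def pooled_single_vector(doc_vecs):
    """Single-vector setting (assumed): mean-pool the layer vectors into one."""
    return doc_vecs.mean(axis=0)


def single_vector_score(query_vec, doc_vec):
    """Plain dot-product relevance score between query and document."""
    return float(doc_vec @ query_vec)


# Random stand-in for a 12-layer encoder's hidden states (16 tokens, dim 8).
rng = np.random.default_rng(0)
hidden_states = rng.normal(size=(12, 16, 8))
query_vec = rng.normal(size=8)

layers = [9, 10, 11]  # illustrative choice of the last three layers
doc_vecs = multi_layer_vectors(hidden_states, layers)
doc_vec = pooled_single_vector(doc_vecs)

print("multi-vector score: ", multi_vector_score(query_vec, doc_vecs))
print("single-vector score:", single_vector_score(query_vec, doc_vec))
```

Under these assumptions the index stores one vector per document instead of one per selected layer, which is the efficiency motivation the abstract gives for pooling. With mean pooling, the single-vector score equals the average of the per-layer scores, so it can never exceed the max-based multi-vector score.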