
Sparsifying Sparse Representations for Passage Retrieval by Top-$k$ Masking (2112.09628v1)

Published 17 Dec 2021 in cs.IR and cs.CL

Abstract: Sparse lexical representation learning has demonstrated much progress in improving passage retrieval effectiveness in recent models such as DeepImpact, uniCOIL, and SPLADE. This paper describes a straightforward yet effective approach for sparsifying lexical representations for passage retrieval, building on SPLADE by introducing a top-$k$ masking scheme to control sparsity and a self-learning method to coax masked representations to mimic unmasked representations. A basic implementation of our model is competitive with more sophisticated approaches and achieves a good balance between effectiveness and efficiency. The simplicity of our methods opens the door for future explorations in lexical representation learning for passage retrieval.
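The paper's exact formulation isn't reproduced on this page, but the core idea in the abstract can be illustrated with a short sketch: keep only the $k$ largest term weights in each lexical representation, and train the masked (sparse) representation to mimic the unmasked one. The function names (`topk_mask`, `self_learning_loss`), the choice of MSE as the distillation objective, and the value `k=256` below are illustrative assumptions, not the paper's method.

```python
import torch
import torch.nn.functional as F

def topk_mask(reps: torch.Tensor, k: int) -> torch.Tensor:
    """Zero out all but the k largest weights in each row.

    reps: (batch, vocab_size) term-weight vectors, e.g. SPLADE-style outputs.
    Returns a tensor of the same shape with at most k nonzeros per row.
    """
    # Find the indices of the k largest weights per representation.
    topk = reps.topk(k, dim=-1)
    # Build a 0/1 mask that keeps only those positions.
    mask = torch.zeros_like(reps)
    mask.scatter_(-1, topk.indices, 1.0)
    return reps * mask

def self_learning_loss(masked: torch.Tensor, unmasked: torch.Tensor) -> torch.Tensor:
    """One plausible 'self-learning' objective (an assumption here):
    push the masked representation toward the full one, detaching the
    unmasked side so it acts as a fixed teacher."""
    return F.mse_loss(masked, unmasked.detach())

# Usage: sparsify a batch of 30,522-dim (BERT vocabulary) term-weight vectors.
reps = torch.relu(torch.randn(2, 30522))   # stand-in for model outputs
sparse_reps = topk_mask(reps, k=256)       # keep the top 256 terms per passage
loss = self_learning_loss(sparse_reps, reps)
```

Because only $k$ terms survive per passage, the inverted index stays small and query latency drops, while the distillation term keeps the sparse vectors close to the full representations the model would otherwise produce.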

Authors (3)
  1. Jheng-Hong Yang (14 papers)
  2. Xueguang Ma (36 papers)
  3. Jimmy Lin (208 papers)
Citations (12)
