Leveraging Decoder Architectures for Learned Sparse Retrieval (2504.18151v1)

Published 25 Apr 2025 in cs.IR

Abstract: Learned Sparse Retrieval (LSR) has traditionally focused on small-scale encoder-only transformer architectures. With the advent of large-scale pre-trained LLMs, their capability to generate sparse representations for retrieval tasks across different transformer-based architectures, including encoder-only, decoder-only, and encoder-decoder models, remains largely unexplored. This study investigates the effectiveness of LSR across these architectures, exploring various sparse representation heads and model scales. Our results highlight the limitations of using LLMs to create effective sparse representations in zero-shot settings, identifying challenges such as inappropriate term expansions and reduced performance due to a lack of expansion. We find that the encoder-decoder architecture with a multi-token decoding approach achieves the best performance among the three backbones. While the decoder-only model performs worse than the encoder-only model, it demonstrates the potential to outperform it when scaled to a larger number of parameters.
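
The abstract refers to "sparse representation heads" that map a transformer's hidden states to vocabulary-sized term-weight vectors. As a point of reference, below is a minimal sketch of a SPLADE-style head over an encoder-only masked-LM backbone, a common LSR baseline setup; the checkpoint name, log-saturation, and max pooling here are illustrative assumptions, not the paper's exact configuration.

```python
# Minimal SPLADE-style sparse-head sketch (assumed setup, not the paper's exact method).
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

name = "distilbert-base-uncased"  # illustrative checkpoint, not from the paper
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForMaskedLM.from_pretrained(name)

def sparse_rep(text: str) -> torch.Tensor:
    """Map text to a |vocab|-sized vector of non-negative term weights."""
    inputs = tokenizer(text, return_tensors="pt")
    logits = model(**inputs).logits                      # (1, seq_len, vocab_size)
    weights = torch.log1p(torch.relu(logits))            # log-saturation dampens large activations
    weights = weights * inputs["attention_mask"].unsqueeze(-1)  # ignore padding positions
    return weights.max(dim=1).values.squeeze(0)          # max-pool over positions -> (vocab_size,)

q = sparse_rep("learned sparse retrieval")
d = sparse_rep("sparse lexical representations for search")
print(torch.dot(q, d).item())                            # dot product serves as the relevance score
```

Terms that receive nonzero weight without appearing in the input are the "term expansions" the abstract discusses; the zero-shot failure modes it identifies correspond to such a head either expanding to inappropriate terms or producing too little expansion to be useful.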
