
Evaluating Token-Level and Passage-Level Dense Retrieval Models for Math Information Retrieval (2203.11163v2)

Published 21 Mar 2022 in cs.IR

Abstract: With the recent success of dense retrieval methods based on bi-encoders, studies have applied this approach to various interesting downstream retrieval tasks with good efficiency and in-domain effectiveness. Recently, we have also seen the presence of dense retrieval models in Math Information Retrieval (MIR) tasks, but the most effective systems remain classic retrieval methods that consider hand-crafted structure features. In this work, we try to combine the best of both worlds: a well-defined structure search method for effective formula search and efficient bi-encoder dense retrieval models to capture contextual similarities. Specifically, we have evaluated two representative bi-encoder models for token-level and passage-level dense retrieval on recent MIR tasks. Our results show that bi-encoder models are highly complementary to existing structure search methods, and we are able to advance the state-of-the-art on MIR datasets.
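The combination the abstract describes — a structure search method complemented by bi-encoder dense retrieval — is commonly realized as score-level fusion of the two rankers' candidate lists. The sketch below illustrates that general idea with linear interpolation; the function name, the fusion weight, and the use of interpolation itself are illustrative assumptions, not the paper's reported method.

```python
# Hypothetical sketch: fusing per-document scores from a structure
# search system and a bi-encoder dense retriever by linear
# interpolation. All names and the weight `alpha` are assumptions
# for illustration, not taken from the paper.

def fuse_scores(structure_scores, dense_scores, alpha=0.5):
    """Interpolate two per-document score dicts.

    A document missing from one ranker contributes 0.0 from that side.
    """
    docs = set(structure_scores) | set(dense_scores)
    return {
        d: alpha * structure_scores.get(d, 0.0)
           + (1.0 - alpha) * dense_scores.get(d, 0.0)
        for d in docs
    }

# Toy example: each ranker scores an overlapping set of candidates.
structure = {"doc1": 0.9, "doc2": 0.4}
dense = {"doc2": 0.8, "doc3": 0.6}
fused = fuse_scores(structure, dense, alpha=0.5)
ranked = sorted(fused, key=fused.get, reverse=True)
```

With equal weighting, a document scored well by both rankers (here `doc2`) rises above documents favored by only one, which is the complementarity effect the abstract reports.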

Authors (4)
  1. Wei Zhong (88 papers)
  2. Jheng-Hong Yang (14 papers)
  3. Yuqing Xie (24 papers)
  4. Jimmy Lin (208 papers)
Citations (16)
