Typo-Robust Representation Learning for Dense Retrieval (2306.10348v1)
Abstract: Dense retrieval is a basic building block of information retrieval applications. One of the main challenges of dense retrieval in real-world settings is handling queries that contain misspelled words. A popular approach to handling misspelled queries is to minimize the discrepancy between the representations of misspelled queries and their pristine counterparts. Unlike the existing approaches, which focus only on the alignment between misspelled and pristine queries, our method also improves the contrast between each misspelled query and its surrounding queries. To assess the effectiveness of our proposed method, we compare it against existing competitors using two benchmark datasets and two base encoders. Our method outperforms the competitors in all cases with misspelled queries. Our code and models are available at https://github.com/panuthept/DST-DenseRetrieval.
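As a rough illustration of the two objectives mentioned in the abstract, the sketch below combines an alignment term that pulls each misspelled query toward its pristine counterpart with an in-batch contrastive term that pushes it away from the other queries. This is not the authors' implementation; the function name `typo_robust_loss`, the temperature value, and the batching scheme are assumptions made only for this example.

```python
# Minimal sketch (assumed, not the paper's exact loss) of a typo-robust training
# objective: align misspelled/pristine query embeddings and contrast each
# misspelled query against the other queries in the batch.
import torch
import torch.nn.functional as F


def typo_robust_loss(pristine_emb: torch.Tensor,
                     misspelled_emb: torch.Tensor,
                     temperature: float = 0.05) -> torch.Tensor:
    """pristine_emb, misspelled_emb: [batch_size, dim] query embeddings."""
    pristine_emb = F.normalize(pristine_emb, dim=-1)
    misspelled_emb = F.normalize(misspelled_emb, dim=-1)

    # Alignment term: pull each misspelled query toward its pristine version
    # (1 - cosine similarity of matched pairs).
    align_loss = (1.0 - (pristine_emb * misspelled_emb).sum(dim=-1)).mean()

    # Contrast term: the pristine counterpart is the positive, all other
    # pristine queries in the batch act as negatives (InfoNCE-style).
    logits = misspelled_emb @ pristine_emb.t() / temperature   # [B, B]
    targets = torch.arange(logits.size(0), device=logits.device)
    contrast_loss = F.cross_entropy(logits, targets)

    return align_loss + contrast_loss
```

In a dual-encoder setup such as DPR, this loss would typically be added to the standard query-passage retrieval loss, so the encoder learns typo robustness without sacrificing ranking quality on clean queries.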