Improving Domain-Specific Retrieval by NLI Fine-Tuning (2308.03103v1)

Published 6 Aug 2023 in cs.CL and cs.IR

Abstract: This article investigates the potential of fine-tuning on natural language inference (NLI) data to improve information retrieval and ranking. We demonstrate this for both English and Polish, using data from one of the largest Polish e-commerce sites and selected open-domain datasets. We employ both monolingual and multilingual sentence encoders, fine-tuned with a supervised method using contrastive loss and NLI data. Our results show that NLI fine-tuning increases model performance on both tasks and in both languages, and can improve both mono- and multilingual models. Finally, we investigate the uniformity and alignment of the embeddings to explain the effect of NLI-based fine-tuning for an out-of-domain use case.
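
The abstract describes two ingredients: contrastive fine-tuning of sentence encoders on NLI data, and uniformity/alignment analysis of the resulting embeddings. Below is a minimal, hypothetical sketch of both using the sentence-transformers library. The model name, training pairs, choice of multiple-negatives ranking loss, and hyperparameters are illustrative assumptions, not the paper's exact configuration.

```python
# Sketch of (1) contrastive fine-tuning on NLI entailment pairs and
# (2) the alignment/uniformity diagnostics (Wang & Isola, 2020).
# All names and settings below are illustrative assumptions.
import torch
import torch.nn.functional as F
from torch.utils.data import DataLoader
from sentence_transformers import SentenceTransformer, InputExample, losses

model = SentenceTransformer("paraphrase-multilingual-MiniLM-L12-v2")

# Premise-hypothesis entailment pairs act as positives; other in-batch
# examples serve as negatives under the multiple-negatives ranking loss,
# a standard contrastive objective for sentence encoders.
train_examples = [
    InputExample(texts=["A man is playing a guitar.", "Someone plays an instrument."]),
    InputExample(texts=["Kobieta gotuje obiad.", "A woman is cooking dinner."]),
]
loader = DataLoader(train_examples, shuffle=True, batch_size=2)
model.fit(
    train_objectives=[(loader, losses.MultipleNegativesRankingLoss(model))],
    epochs=1,
    warmup_steps=0,
)

def alignment(x: torch.Tensor, y: torch.Tensor, alpha: int = 2) -> torch.Tensor:
    """Mean distance between normalized embeddings of positive pairs (lower is better)."""
    x, y = F.normalize(x, dim=-1), F.normalize(y, dim=-1)
    return (x - y).norm(p=2, dim=1).pow(alpha).mean()

def uniformity(x: torch.Tensor, t: float = 2.0) -> torch.Tensor:
    """Log of the mean Gaussian potential over all embedding pairs (lower = more uniform)."""
    x = F.normalize(x, dim=-1)
    return torch.pdist(x, p=2).pow(2).mul(-t).exp().mean().log()

# Usage: alignment over positive pairs, uniformity over the full embedding set.
emb_a = model.encode(["A man is playing a guitar."], convert_to_tensor=True)
emb_b = model.encode(["Someone plays an instrument."], convert_to_tensor=True)
print(alignment(emb_a, emb_b), uniformity(torch.cat([emb_a, emb_b])))
```

Per the abstract, these two diagnostics are what the authors use to explain why NLI-based fine-tuning helps in an out-of-domain setting; the sketch above only shows how the quantities are typically computed.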

Authors (4)
  1. Roman Dušek (1 paper)
  2. Aleksander Wawer (3 papers)
  3. Christopher Galias (3 papers)
  4. Lidia Wojciechowska (1 paper)
Citations (1)
