Papers
Topics
Authors
Recent
Gemini 2.5 Flash
Gemini 2.5 Flash
38 tokens/sec
GPT-4o
59 tokens/sec
Gemini 2.5 Pro Pro
41 tokens/sec
o3 Pro
7 tokens/sec
GPT-4.1 Pro
50 tokens/sec
DeepSeek R1 via Azure Pro
28 tokens/sec
2000 character limit reached

Enriching Conversation Context in Retrieval-based Chatbots (1911.02290v1)

Published 6 Nov 2019 in cs.CL

Abstract: Work on retrieval-based chatbots, like most sequence pair matching tasks, can be divided into Cross-encoders that perform word matching over the pair, and Bi-encoders that encode the pair separately. The latter has better performance, however since candidate responses cannot be encoded offline, it is also much slower. Lately, multi-layer transformer architectures pre-trained as LLMs have been used to great effect on a variety of natural language processing and information retrieval tasks. Recent work has shown that these LLMs can be used in text-matching scenarios to create Bi-encoders that perform almost as well as Cross-encoders while having a much faster inference speed. In this paper, we expand upon this work by developing a sequence matching architecture that %takes into account contexts in the training dataset at inference time. utilizes the entire training set as a makeshift knowledge-base during inference. We perform detailed experiments demonstrating that this architecture can be used to further improve Bi-encoders performance while still maintaining a relatively high inference speed.

User Edit Pencil Streamline Icon: https://streamlinehq.com
Authors (2)
  1. Amir Vakili Tahami (2 papers)
  2. Azadeh Shakery (26 papers)
Citations (3)