Dataset of Natural Language Queries for E-Commerce (2302.06355v1)

Published 13 Feb 2023 in cs.IR

Abstract: Online shopping is increasingly common in everyday life. For e-commerce search systems, understanding natural language arriving through voice assistants, chatbots, or conversational search is essential for determining what the user really wants. However, evaluation datasets containing natural, detailed information needs of product seekers that could be used for research do not exist. Because of privacy issues and competitive consequences, only a few datasets with real user search queries from logs are openly available. In this paper, we present a dataset of 3,540 natural language queries in two domains that describe what users want when searching for a laptop or a jacket of their choice. The dataset contains annotations of vague terms and key facts for 1,754 laptop queries. This dataset opens up a range of research opportunities in natural language processing and (interactive) information retrieval for product search.

Authors (6)
  1. Andrea Papenmeier (9 papers)
  2. Dagmar Kern (17 papers)
  3. Daniel Hienert (34 papers)
  4. Alfred Sliwa (3 papers)
  5. Ahmet Aker (9 papers)
  6. Norbert Fuhr (15 papers)
Citations (7)
