
Rethinking E-Commerce Search (2312.03217v1)

Published 6 Dec 2023 in cs.IR and cs.CL

Abstract: E-commerce search and recommendation usually operate on structured data such as product catalogs and taxonomies. However, creating better search and recommendation systems often requires a large variety of unstructured data including customer reviews and articles on the web. Traditionally, the solution has always been converting unstructured data into structured data through information extraction, and conducting search over the structured data. However, this is a costly approach that often has low quality. In this paper, we envision a solution that does entirely the opposite. Instead of converting unstructured data (web pages, customer reviews, etc) to structured data, we instead convert structured data (product inventory, catalogs, taxonomies, etc) into textual data, which can be easily integrated into the text corpus that trains LLMs. Then, search and recommendation can be performed through a Q/A mechanism through an LLM instead of using traditional information retrieval methods over structured data.

Citations (2)

Summary

  • The paper proposes a paradigm shift: converting structured product data into annotated text so that LLMs can handle search and recommendation tasks.
  • It introduces universal IDs and text annotations to improve query understanding and product retrieval by leveraging LLMs' world knowledge.
  • The work also outlines future research directions such as latency optimization, personalization, and mitigating catastrophic forgetting.

Introduction

The paper "Rethinking E-Commerce Search" presents a novel approach to the challenges of search and recommendation systems in the e-commerce domain. The traditional method converts unstructured data into structured formats to facilitate search and retrieval. The authors propose the opposite: converting structured data into text, so that it can be integrated with LLMs for search and recommendation.

E-commerce search is dominated by structured data from product catalogs. Two main challenges are highlighted: query understanding and product understanding. Traditional methods struggle to extract user intent from short search queries and to relate that intent to product attributes. They also struggle to capture the world knowledge needed to understand products beyond their basic features.

Vision for a New Approach

The paper proposes a paradigm shift in which structured and semi-structured data are transformed into text. This allows LLMs, pretrained on extensive text corpora, to handle search and recommendation tasks through question-and-answer mechanisms. Because LLMs already encode broad world knowledge, the authors argue that this approach can bypass the need for complex product knowledge graphs and dedicated query understanding systems.

Technical Implementation

  1. Universal IDs: Establishing universal IDs for database entities embedded into text allows LLMs to refer to these during query processing. The authors discuss methods for ID representation to mitigate the issues of large ID spaces and prior knowledge interference.
  2. Annotated Text Generation: The conversion of structured data into annotated texts incorporates entity IDs into textual descriptions. This is achieved through manually created templates, LLM-generated descriptions, and leveraging user engagement data for query-based templates.
  3. System Architecture: The annotated texts are ingested during training, and the LLM is fine-tuned on them so that it absorbs the database knowledge while retaining its world knowledge and linguistic capabilities.
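
The first two steps above can be illustrated with a minimal sketch of template-based annotated text generation. Everything specific here is an assumption made for illustration and is not taken from the paper: the `Product` fields, the `[PID_########]` ID format, and the two templates are hypothetical. The paper also mentions LLM-generated descriptions and query-based templates, which this sketch does not cover.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Product:
    # Hypothetical catalog record; field names are illustrative only.
    product_id: int
    title: str
    brand: str
    category: str
    price_usd: float


def universal_id(product: Product) -> str:
    # Assumed ID format: a bracketed, zero-padded token that is unlikely
    # to collide with ordinary words the model has already seen.
    return f"[PID_{product.product_id:08d}]"


# Manually written templates; each one embeds the universal ID next to
# the product mention so the text stays linked to the database entity.
TEMPLATES = (
    "{title} {uid} is a {category} made by {brand}.",
    "{brand} sells {title} {uid} for ${price:.2f}.",
)


def annotate(product: Product) -> list[str]:
    # Render every template for one structured record.
    uid = universal_id(product)
    return [
        template.format(
            title=product.title,
            uid=uid,
            category=product.category,
            brand=product.brand,
            price=product.price_usd,
        )
        for template in TEMPLATES
    ]


if __name__ == "__main__":
    shoe = Product(42, "Trail Runner 2", "Acme", "running shoe", 89.5)
    for sentence in annotate(shoe):
        print(sentence)
```

Sentences produced this way would form part of the fine-tuning corpus described in step 3. Keeping the ID adjacent to the product name in every sentence is one simple way to give the model repeated exposure to the name-to-ID association.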

Inference and Applications

The proposed system utilizes LLMs for retrieval and recommendation at inference time through various configurations, such as zero-shot and few-shot learning. The paper illustrates potential use cases like product search retrieval, recommendations, and search suggestions, using specific prompts designed to elicit responses with linked product IDs.
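
As a rough sketch of the inference side, the snippet below builds a zero-shot or few-shot prompt and pulls linked product IDs out of a model response. The prompt wording, the `[PID_########]` ID format, and the function names are assumptions for illustration, not the paper's actual prompts. No model is called; the response string is supplied by the caller.

```python
import re

# Assumed ID format: "[PID_" + eight digits + "]". The summary does not
# specify how IDs are rendered in text.
ID_PATTERN = re.compile(r"\[PID_(\d{8})\]")


def build_prompt(query: str, examples: tuple = ()) -> str:
    # With no examples this is a zero-shot prompt; each (question, answer)
    # pair added to `examples` makes it a few-shot prompt.
    parts = ["Answer with matching products and their IDs."]
    for question, answer in examples:
        parts.append(f"Q: {question}\nA: {answer}")
    parts.append(f"Q: {query}\nA:")
    return "\n\n".join(parts)


def extract_product_ids(response: str) -> list[int]:
    # Collect IDs in order of first appearance, dropping repeats, so the
    # result can be used directly as a ranked retrieval list.
    ids: list[int] = []
    for match in ID_PATTERN.finditer(response):
        product_id = int(match.group(1))
        if product_id not in ids:
            ids.append(product_id)
    return ids


if __name__ == "__main__":
    few_shot = (("red running shoes", "Trail Runner 2 [PID_00000042]"),)
    print(build_prompt("waterproof hiking boots", few_shot))
```

In a real system the extracted IDs would still need to be validated against the catalog, since a generative model can emit an ID that does not correspond to any product.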

Research Directions

The authors outline several areas for further investigation, including:

  • Latency Optimization: Strategies to reduce response times, including encoder-based approaches and model compression via distillation, pruning, and quantization.
  • Personalization: Embedding user history features as input context to enhance personalized search experiences.
  • Catastrophic Forgetting: Ensuring the model retains previously learned information as it is updated to reflect product databases that evolve and expand over time.

Conclusion

The paper posits a transformative shift in e-commerce search: using LLMs to manage both structured product information and broad world knowledge. This fusion aims to simplify complex systems and improve retrieval quality, with a unified model infrastructure envisioned as the future of e-commerce information systems. Future research will likely focus on reducing latency and on improving scalability and personalization.


Authors (2)
