- The paper introduces a deep learning approach for semantic product search using a Siamese network architecture.
- A novel 3-part hinge loss distinguishes between purchased, semi-negative, and negative products, enhancing search precision.
- Advanced tokenization and hashing techniques are applied to handle lexical variations, achieving significant gains in Recall@100 and MAP.
Semantic Product Search
Introduction
The paper "Semantic Product Search" addresses the problem of improving semantic matching in product search using deep learning techniques. Traditional lexical matching methods fall short because they cannot handle nuances such as synonyms, hypernyms, and spelling errors. The paper presents methodologies to overcome these obstacles by training deep learning models on customer behavior data, thereby capturing semantic relationships and significantly improving product search efficacy.
Model Architecture and Loss Function
The proposed model follows a Siamese neural network architecture with embeddings shared between queries and products, ensuring effective word-level matching. The model employs average pooling, which, despite being far simpler than recurrent or convolutional encoders (e.g., LSTMs, CNNs), proves sufficient for capturing the short-range linguistic dependencies typical of product titles and user queries.
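The shared-embedding, average-pooling encoder can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the 4-dimensional toy embedding table and the example tokens are invented, and both query and product are encoded by the same function, which is what makes the architecture Siamese.

```python
import math

# Toy shared embedding table (hypothetical 4-d vectors). In a Siamese
# setup, the SAME table encodes both query tokens and product tokens.
EMB = {
    "red":     [1.0, 0.0, 0.2, 0.1],
    "shoes":   [0.1, 1.0, 0.0, 0.3],
    "scarlet": [0.9, 0.1, 0.2, 0.0],
    "sneaker": [0.2, 0.9, 0.1, 0.2],
}

def encode(tokens):
    """Average-pool token embeddings into one fixed-size vector."""
    vecs = [EMB[t] for t in tokens if t in EMB]
    n = len(vecs)
    return [sum(col) / n for col in zip(*vecs)]

def cosine(a, b):
    """Cosine similarity between two encoded vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

# "red shoes" and "scarlet sneaker" share no words, yet their pooled
# embeddings can still be close -- the semantic-matching effect.
print(cosine(encode(["red", "shoes"]), encode(["scarlet", "sneaker"])))
```

Because pooling is a simple mean, the encoder is cheap enough to run over an entire catalog offline, which is one practical argument for it over sequence models.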
A significant contribution of the paper is the introduction of a novel 3-part hinge loss function. This loss function distinguishes between purchased (positive), impressed-but-not-purchased (semi-negative), and random non-purchased products (negative). It ensures that the embeddings for purchased products are more similar to the query than the embeddings for impressed and random products. This nuanced approach increases model precision and generalizes better to unseen data.
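The three-way distinction above can be expressed as a sum of hinge terms. The sketch below is illustrative: the margin values are placeholders, not the paper's settings, and the function scores a single (positive, semi-negative, negative) triple rather than a batch.

```python
def three_part_hinge(sim_pos, sim_semi, sim_neg,
                     pos_margin=0.9, semi_margin=0.55, neg_margin=0.2):
    """Hedged sketch of a 3-part hinge loss over cosine similarities.

    Purchased products are pulled above pos_margin, impressed-but-not-
    purchased products are pushed below semi_margin, and random negatives
    are pushed below the stricter neg_margin. Margins here are invented
    for illustration.
    """
    loss_pos = max(0.0, pos_margin - sim_pos)    # purchases: not high enough
    loss_semi = max(0.0, sim_semi - semi_margin)  # impressions: too high
    loss_neg = max(0.0, sim_neg - neg_margin)    # random negatives: too high
    return loss_pos + loss_semi + loss_neg
```

A triple that already satisfies all three margins contributes zero loss, so training effort concentrates on the violating examples.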
Tokenization and Hashing
A sophisticated tokenization strategy is employed to transform product descriptions and queries into reliable input features for the model. The strategy combines word unigrams, bigrams, character trigrams, and employs hashing techniques for handling out-of-vocabulary (OOV) words. This method not only captures the semantic intent but also provides robustness against typographical errors and unseen terms.
The use of character trigrams is particularly effective for handling morphological variations and spelling errors. Hashing OOV words ensures that the model can handle unseen inputs gracefully, avoiding the pitfall of treating all unknown tokens the same.
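The tokenization pipeline described above can be sketched as follows. This is a simplified stand-in: the bucket count and word-boundary padding character are assumptions, and a stable CRC32 hash is used here in place of whatever hash family the authors chose, so that out-of-vocabulary tokens map to distinct buckets rather than a single UNK id.

```python
import zlib

def featurize(text, num_hash_buckets=1000, vocab=None):
    """Build word unigrams, bigrams, and char trigrams, then map each
    token to an id, hashing out-of-vocabulary tokens into buckets."""
    words = text.lower().split()
    bigrams = ["_".join(pair) for pair in zip(words, words[1:])]
    trigrams = []
    for w in words:
        padded = "#" + w + "#"  # '#' marks word boundaries (assumed)
        trigrams += [padded[i:i + 3] for i in range(len(padded) - 2)]
    tokens = words + bigrams + trigrams

    vocab = vocab or {}
    ids = []
    for t in tokens:
        if t in vocab:
            ids.append(vocab[t])
        else:
            # Hash OOV tokens into a fixed range past the known vocab,
            # so unseen/misspelled words still get (bucketed) distinct ids.
            ids.append(len(vocab) + zlib.crc32(t.encode()) % num_hash_buckets)
    return tokens, ids

tokens, ids = featurize("red shoes")
```

A misspelling like "shose" still shares the trigrams "#sh" and "sho" with "shoes", which is how the character features buy robustness to typos.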
Evaluation Metrics and Results
The study evaluates the model's performance using Recall@100 and Mean Average Precision (MAP) for matching, and additional metrics such as NDCG and MRR for ranking tasks. The results demonstrate substantial improvements over state-of-the-art semantic search baselines, with at least a 4.7% improvement in Recall@100 and a 14.5% improvement in MAP. These gains are crucial for meeting the precision standards required by industrial-scale product search applications.
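For readers less familiar with the matching metrics, Recall@K and average precision can be computed directly from a ranked result list; MAP is then the mean of average precision over a set of queries. The implementation below is a standard textbook version, not code from the paper.

```python
def recall_at_k(ranked_ids, relevant_ids, k=100):
    """Fraction of relevant items that appear in the top-k results."""
    hits = sum(1 for pid in ranked_ids[:k] if pid in relevant_ids)
    return hits / len(relevant_ids)

def average_precision(ranked_ids, relevant_ids):
    """Mean of precision values at each rank where a relevant item appears."""
    hits, total = 0, 0.0
    for rank, pid in enumerate(ranked_ids, start=1):
        if pid in relevant_ids:
            hits += 1
            total += hits / rank
    return total / len(relevant_ids) if relevant_ids else 0.0

# Toy example: products 1 and 3 are relevant (e.g., purchased).
print(recall_at_k([1, 2, 3, 4], {1, 3}, k=2))     # only product 1 in top-2
print(average_precision([1, 2, 3, 4], {1, 3}))    # hits at ranks 1 and 3
```

Unlike NDCG and MRR, which reward putting relevant items near the top, Recall@K measures whether the matching stage surfaces relevant products at all, which is why the paper uses it for the match set.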
Online Experiments and Practical Implications
Online A/B tests were conducted across product categories such as toys, games, and kitchen products, yielding statistically significant improvements in conversion rates and revenue. These tests underscore the model's capability to enhance the user experience by returning more semantically relevant search results, ultimately improving customer satisfaction and business performance.
Training Accelerations
To handle vast datasets efficiently, the authors adopt a model parallelism approach, leveraging multiple GPUs to distribute computational burdens without linearly increasing communication costs. This strategy enables the training of complex models with larger embedding sizes, essential for capturing the intricacies in product search scenarios.
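One common form of model parallelism for large embedding tables is to partition the table row-wise across devices, so each GPU stores only a slice and lookups are routed to the owning device. The sketch below illustrates only that routing logic with plain Python lists standing in for devices; the sharding scheme is a generic assumption, not a description of the authors' exact setup.

```python
def build_shards(vocab_size, num_devices):
    """Partition embedding rows into contiguous (start, end) ranges,
    one range per device (hypothetical row-wise sharding)."""
    rows_per = -(-vocab_size // num_devices)  # ceiling division
    return [(d * rows_per, min((d + 1) * rows_per, vocab_size))
            for d in range(num_devices)]

def route(token_id, shards):
    """Return (device_index, local_row) for the device owning this row."""
    for device, (lo, hi) in enumerate(shards):
        if lo <= token_id < hi:
            return device, token_id - lo
    raise IndexError(f"token id {token_id} outside the vocabulary")

shards = build_shards(vocab_size=10, num_devices=3)
```

Because each lookup touches exactly one shard, adding devices grows total embedding capacity without a matching growth in per-lookup communication, which is the property the section above appeals to.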
Conclusion
The paper presents a comprehensive approach to semantic product search that effectively leverages deep learning techniques to improve search precision and recall. By addressing lexical matching limitations, this method enhances the quality of search results and helps customers find products more efficiently. Future research could focus on further improving precision and incorporating advanced mechanisms like attention models to refine semantic understanding during the search process.