
Billion-scale Commodity Embedding for E-commerce Recommendation in Alibaba (1803.02349v2)

Published 6 Mar 2018 in cs.IR and cs.AI

Abstract: Recommender systems (RSs) have been the most important technology for increasing the business in Taobao, the largest online consumer-to-consumer (C2C) platform in China. The billion-scale data in Taobao creates three major challenges to Taobao's RS: scalability, sparsity and cold start. In this paper, we present our technical solutions to address these three challenges. The methods are based on the graph embedding framework. We first construct an item graph from users' behavior history. Each item is then represented as a vector using graph embedding. The item embeddings are employed to compute pairwise similarities between all items, which are then used in the recommendation process. To alleviate the sparsity and cold start problems, side information is incorporated into the embedding framework. We propose two aggregation methods to integrate the embeddings of items and the corresponding side information. Experimental results from offline experiments show that methods incorporating side information are superior to those that do not. Further, we describe the platform upon which the embedding methods are deployed and the workflow to process the billion-scale data in Taobao. Using an online A/B test, we show that the online Click-Through-Rates (CTRs) are improved compared to the previous recommendation methods widely used in Taobao, further demonstrating the effectiveness and feasibility of our proposed methods in Taobao's live production environment.

Citations (466)

Summary

  • The paper introduces a graph embedding framework that constructs item graphs from session-based user data to improve recommendation accuracy.
  • The embedding methods incorporate side information such as categories and brands to effectively mitigate sparsity and cold start issues.
  • Experimental results, including significant CTR gains on Taobao, validate the scalability and efficacy of the proposed models.

Overview of "Billion-scale Commodity Embedding for E-commerce Recommendation in Alibaba"

The paper "Billion-scale Commodity Embedding for E-commerce Recommendation in Alibaba" by Wang et al. addresses three challenges facing the recommender system (RS) of Taobao, Alibaba's major consumer-to-consumer platform: scalability, sparsity, and cold start. At the time of writing, Taobao served about one billion users and two billion items.

Key Technical Contributions

The authors introduce a graph embedding framework for recommendation: an item graph is built from user behavior data, and embedding techniques are applied to it to capture item similarities. The key aspects of the research include:

  1. Graph Construction and Embedding:
    • An item graph is constructed from session-based user behavior: nodes represent items, and a directed edge connects two items that a user interacted with consecutively, weighted by how often that transition occurs.
    • Three graph embedding methods are proposed to generate low-dimensional item embeddings, which allow item similarities to be computed efficiently: Base Graph Embedding (BGE), Graph Embedding with Side Information (GES), and Enhanced Graph Embedding with Side Information (EGES).
  2. Integration of Side Information:
    • GES and EGES incorporate side information to alleviate the sparsity and cold start problems, improving the embeddings of items with few or no user interactions.
    • Side information includes categories, brands, and other attributes. Each is embedded and aggregated with the item embedding, by plain averaging in GES and by learned per-item weights in EGES.
  3. Experimental Evaluation:
    • Offline link-prediction experiments, evaluated by AUC, were conducted on datasets from both Amazon and Taobao, and showed GES and EGES outperforming the baseline models.
    • Online A/B testing showed significant improvements in Click-Through-Rates (CTRs) on Taobao's mobile app, with EGES outperforming other methods.
  4. Deployment and Scalability:
    • The embedding methods are deployed on Alibaba’s production system leveraging a distributed platform, ensuring efficient processing of billion-scale data.
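
The graph construction step in item 1 can be sketched in a few lines. This is a minimal illustration rather than the authors' pipeline: the function names, the toy sessions, and the choice to skip immediate repeats of the same item are all assumptions made here. Edges are directed and weighted by transition counts, and random walks pick the next item in proportion to edge weight, in the spirit of DeepWalk-style sampling.

```python
import random
from collections import defaultdict


def build_item_graph(sessions):
    """Count directed transitions between consecutively clicked items."""
    graph = defaultdict(lambda: defaultdict(int))
    for session in sessions:
        for src, dst in zip(session, session[1:]):
            if src != dst:  # assumption: ignore immediate repeats of one item
                graph[src][dst] += 1
    return graph


def random_walk(graph, start, length, rng):
    """Walk the graph, choosing each next item in proportion to edge weight."""
    walk = [start]
    while len(walk) < length:
        neighbors = graph.get(walk[-1])
        if not neighbors:  # dead end: the current item has no outgoing edges
            break
        items = list(neighbors)
        weights = [neighbors[item] for item in items]
        walk.append(rng.choices(items, weights=weights, k=1)[0])
    return walk


# Toy sessions; the item IDs are made up for illustration.
sessions = [
    ["A", "B", "C"],
    ["A", "B", "D"],
    ["B", "C", "E"],
]
graph = build_item_graph(sessions)
rng = random.Random(0)
walks = [random_walk(graph, item, 5, rng) for item in list(graph)]
```

The resulting walks play the role of sentences: a skip-gram model trained on them would yield one vector per item, which is the general idea behind the base embedding described above.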

Implications for Recommender Systems

The results showcase the potential of integrating graph embedding with additional side information in enhancing the performance of RSs, especially for large datasets. The work offers practical solutions for real-time applications in e-commerce, enabling better personalization and user engagement by improving the relevance of recommendations.
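
To make the role of side information concrete, the two aggregation schemes can be sketched as follows. This is an illustrative toy, assuming that GES averages the item and side-information vectors and that EGES takes a weighted average with weights exp(a_j) normalized to sum to one. The vectors, weights, and function names are hypothetical, and in a real system the weights would be learned during training rather than set by hand.

```python
import math


def ges_aggregate(vectors):
    """GES-style aggregation: plain average of item and side-information vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def eges_aggregate(vectors, raw_weights):
    """EGES-style aggregation: weighted average with weights exp(a_j) / sum(exp(a))."""
    exp_weights = [math.exp(a) for a in raw_weights]
    total = sum(exp_weights)
    return [
        sum(w * x for w, x in zip(exp_weights, column)) / total
        for column in zip(*vectors)
    ]


# Hypothetical 2-dimensional embeddings for one item and two side-information fields.
item_vec = [1.0, 0.0]
category_vec = [0.0, 1.0]
brand_vec = [1.0, 1.0]
vectors = [item_vec, category_vec, brand_vec]

averaged = ges_aggregate(vectors)
equal_weighted = eges_aggregate(vectors, [0.0, 0.0, 0.0])
item_dominated = eges_aggregate(vectors, [10.0, 0.0, 0.0])

# Cold-start item: no item vector yet, so average its side-information vectors.
cold_start = ges_aggregate([category_vec, brand_vec])
```

With equal raw weights the EGES-style result coincides with the plain average, while a large weight on one field pulls the aggregate toward that field's vector. The cold-start case shows why side information helps: an item with no interaction history can still be placed in the embedding space through its category and brand.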

Future Directions

The authors suggest further exploration into advanced aggregation techniques like attention mechanisms for processing side information. Moreover, there is potential to incorporate textual data, such as user reviews, to enrich the item representations further.

In summary, the paper offers a scalable approach to the sparsity and cold start problems in large e-commerce RSs such as Taobao's, and shows how diverse types of side information can be integrated into the embedding process to improve recommendation quality.