
A/B Testing for Recommender Systems in a Two-sided Marketplace (2106.00762v2)

Published 29 May 2021 in cs.SI, stat.AP, and stat.ME

Abstract: Two-sided marketplaces are standard business models of many online platforms (e.g., Amazon, Facebook, LinkedIn), wherein the platforms have consumers, buyers or content viewers on one side and producers, sellers or content-creators on the other. Consumer side measurement of the impact of a treatment variant can be done via simple online A/B testing. Producer side measurement is more challenging because the producer experience depends on the treatment assignment of the consumers. Existing approaches for producer side measurement are either based on graph cluster-based randomization or on certain treatment propagation assumptions. The former approach results in low-powered experiments as the producer-consumer network density increases and the latter approach lacks a strict notion of error control. In this paper, we propose (i) a quantification of the quality of a producer side experiment design, and (ii) a new experiment design mechanism that generates high-quality experiments based on this quantification. Our approach, called UniCoRn (Unifying Counterfactual Rankings), provides explicit control over the quality of the experiment and its computation cost. Further, we prove that our experiment design is optimal to the proposed design quality measure. Our approach is agnostic to the density of the producer-consumer network and does not rely on any treatment propagation assumption. Moreover, unlike the existing approaches, we do not need to know the underlying network in advance, making this widely applicable to the industrial setting where the underlying network is unknown and challenging to predict a priori due to its dynamic nature. We use simulations to validate our approach and compare it against existing methods. We also deployed UniCoRn in an edge recommendation application that serves tens of millions of members and billions of edge recommendations daily.

Citations (16)

Summary

  • The paper presents UniCoRn, an experiment design for measuring producer-side treatment effects in two-sided marketplaces, where standard A/B testing only covers the consumer side.
  • It introduces a quantitative measure of experiment design quality; the resulting design gives explicit control over quality and computation cost and does not need the producer-consumer network to be known in advance.
  • The design is proven optimal with respect to that quality measure, validated in simulations, and deployed in an edge recommendation application serving tens of millions of members.

The paper "A/B Testing for Recommender Systems in a Two-sided Marketplace" addresses the difficulty of running A/B tests on two-sided online platforms such as Amazon, Facebook, and LinkedIn, where consumers (buyers or content viewers) and producers (sellers or content creators) interact within a shared ecosystem. The main challenge is that a producer's experience depends on the treatment assignment of the consumers it is connected to.

Key Contributions:

  1. Producer-Side Experiment Challenges: Simple online A/B testing works well for the consumer side but not for the producer side, because a producer's experience depends on the treatment assignment of its consumers. Existing methods based on graph cluster-based randomization yield low-powered experiments as the producer-consumer network becomes denser. Other methods rely on treatment propagation assumptions and lack a strict notion of error control.
  2. Quality of Experiment Design: The paper introduces a quantitative measure of the quality of a producer-side experiment design, which makes it possible to compare designs and to state what a given design achieves.
  3. UniCoRn Approach: The proposed method, UniCoRn (Unifying Counterfactual Rankings), generates high-quality experiments according to this measure. It provides explicit control over both the quality of the experiment and its computation cost, is agnostic to the density of the producer-consumer network, and does not rely on any treatment propagation assumption.
  4. Optimality and Applicability: UniCoRn is proven to be optimal with respect to the proposed quality measure. Unlike existing methods, it does not require the underlying network to be known in advance, which makes it applicable in industrial settings where the network is dynamic and hard to predict a priori.
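The producer-side difficulty in item 1 can be illustrated with a small simulation. The sketch below is not UniCoRn and does not come from the paper; it assumes a hypothetical random producer-consumer network in which every edge exists independently with probability `edge_prob`, randomizes consumers 50/50, and reports the share of connected producers whose consumers fall in both arms. The function name and all parameters are illustrative assumptions.

```python
import random


def mixed_exposure_share(n_consumers, n_producers, edge_prob, seed=0):
    """Share of connected producers whose consumers span both arms.

    Assumption: a hypothetical random bipartite network where each
    consumer-producer edge exists independently with probability edge_prob.
    """
    rng = random.Random(seed)
    # Consumer-side A/B test: each consumer is independently assigned to
    # treatment (True) or control (False) with probability 0.5.
    arm = [rng.random() < 0.5 for _ in range(n_consumers)]

    mixed = 0
    connected = 0
    for _ in range(n_producers):
        # Collect the arms of the consumers this producer is connected to.
        arms_seen = {
            arm[c] for c in range(n_consumers) if rng.random() < edge_prob
        }
        if arms_seen:
            connected += 1
            # Two distinct arms means the producer sees a mix of
            # treatment and control consumers.
            if len(arms_seen) == 2:
                mixed += 1
    return mixed / connected if connected else 0.0


if __name__ == "__main__":
    for edge_prob in (0.005, 0.05, 0.5):
        share = mixed_exposure_share(200, 500, edge_prob)
        print(f"edge_prob={edge_prob}: mixed-exposure share={share:.3f}")
```

Under these assumptions, the share of producers with mixed exposure grows with network density, so a producer can rarely be labeled as purely "treatment" or "control" from consumer-side randomization alone.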

Validation and Deployment:

The approach was validated through simulations and compared against existing methods. UniCoRn was also deployed in an edge recommendation application that serves tens of millions of members and billions of edge recommendations daily, which shows that the design can run in a large-scale, dynamic marketplace.

Implications:

This work matters for the accuracy and reliability of A/B testing in two-sided marketplaces. It offers a computationally controllable way to account for the dependence of producer outcomes on consumer treatment assignment, so that recommender changes can be evaluated for their effect on producers as well as consumers.

In summary, UniCoRn advances experiment design for recommender systems in two-sided marketplaces: it produces producer-side experiments of controlled quality without requiring prior knowledge of the underlying network structure.