On the Factory Floor: ML Engineering for Industrial-Scale Ads Recommendation Models (2209.05310v1)

Published 12 Sep 2022 in cs.IR and cs.LG

Abstract: For industrial-scale advertising systems, prediction of ad click-through rate (CTR) is a central problem. Ad clicks constitute a significant class of user engagements and are often used as the primary signal for the usefulness of ads to users. Additionally, in cost-per-click advertising systems where advertisers are charged per click, click rate expectations feed directly into value estimation. Accordingly, CTR model development is a significant investment for most Internet advertising companies. Engineering for such problems requires many ML techniques suited to online learning that go well beyond traditional accuracy improvements, especially concerning efficiency, reproducibility, calibration, credit attribution. We present a case study of practical techniques deployed in Google's search ads CTR model. This paper provides an industry case study highlighting important areas of current ML research and illustrating how impactful new ML methods are evaluated and made useful in a large-scale industrial setting.

Authors (12)

Rohan Anil (32 papers)
Sandra Gadanho (1 paper)
Da Huang (67 papers)
Nijith Jacob (1 paper)
Zhuoshu Li (7 papers)
Dong Lin (15 papers)
Todd Phillips (2 papers)
Cristina Pop (5 papers)
Kevin Regan (5 papers)
Gil I. Shamir (11 papers)
Rakesh Shivanna (10 papers)
Qiqi Yan (12 papers)

Citations (34)


Summary

Industrial-Scale CTR Prediction for Ads Recommendation

The paper, "On the Factory Floor: ML Engineering for Industrial-Scale Ads Recommendation Models," presents a case study of the engineering and deployment of click-through rate (CTR) prediction systems in a large-scale industrial context, focusing specifically on Google's search ads models. The authors address the core challenges of designing industrial-scale CTR prediction systems: models with billions of weights, very large volumes of input data, and high-speed inference requirements. The discussion offers insights into practical techniques that improve efficiency and accuracy without compromising the system's maintainability.

Context and Importance

CTR prediction is a pivotal problem in online advertising systems because ad clicks serve as a primary signal of ad usefulness and, in cost-per-click advertising, feed directly into value estimation. The paper underscores that designing CTR models for these systems means balancing accuracy improvements against operational costs. The combination of billions of data points and stringent real-time performance constraints drives the need for robust ML engineering practices that improve both model accuracy and infrastructure efficiency.

Technical and Methodological Insights

The authors present several sophisticated methods used in the practical implementation of Google's search ads CTR prediction model. These include:

Model Representation & Feature Engineering: The discussion highlights the importance of representing ad-query pairs efficiently by combining attention layers and DNNs with feature generation techniques such as bigrams and other n-grams. This approach handles the sparse nature of ad click data without pushing computation costs beyond practical limits.
Online Optimization and Efficiency: Given the non-stationarity of ad click data, the authors emphasize online learning, where the model is trained sequentially on data in the order it arrives. A single-pass streaming algorithm is used, so each example is visited once as the model makes predictions and updates its weights.
Improving Accuracy and Reproducibility: The paper details multiple strategies aimed at enhancing model accuracy, including rank losses, distillation with a teacher-student setup, and second-order optimization using Distributed Shampoo. These methods are shown to yield substantial improvements in accuracy while keeping the system cost-effective.
Generalization Across UI Treatments: The integration of model factorization allows separate optimization of ad quality and user interface (UI) features, providing a mechanism to explore multivariate presentation strategies efficiently. This separation enables more flexible experimentation and optimizes UI treatment selection without significant performance degradation.
Mitigating Irreproducibility: The challenge of irreproducibility is addressed by applying techniques such as smooth activation functions and bias constraints, which reduce non-identifiability and variability across retrained models, leading to consistent system outcomes.
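Two of the items above can be made concrete with a small sketch. The code below is an illustration under stated assumptions, not the paper's implementation: `OnlineCTRModel`, `train_single_pass`, and `synthetic_stream` are hypothetical names, the model is a plain hashed logistic regression rather than a deep network, and scoring each example before updating on it is an assumed evaluation choice that the summary does not describe. The `smelu` helper is a standalone example of a smooth activation (a smooth ReLU with a quadratic transition region); the summary does not name the specific activation used, so it is included only as one plausible instance.

```python
import math
import random
import zlib


def smelu(x, beta=1.0):
    """Smooth ReLU sketch: 0 below -beta, identity above beta,
    and a quadratic blend in between so the gradient is continuous."""
    if x <= -beta:
        return 0.0
    if x >= beta:
        return float(x)
    return (x + beta) ** 2 / (4.0 * beta)


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


class OnlineCTRModel:
    """Hypothetical hashed logistic regression trained one example at a time."""

    def __init__(self, num_buckets=1 << 16, learning_rate=0.1):
        self.weights = [0.0] * num_buckets
        self.bias = 0.0
        self.learning_rate = learning_rate

    def _buckets(self, features):
        # crc32 gives a hash that is stable across runs, unlike built-in hash().
        return [zlib.crc32(f.encode("utf-8")) % len(self.weights) for f in features]

    def predict(self, features):
        z = self.bias + sum(self.weights[i] for i in self._buckets(features))
        return sigmoid(z)

    def update(self, features, clicked):
        """One SGD step on log loss; returns the prediction made before the step."""
        p = self.predict(features)
        grad = p - clicked
        self.bias -= self.learning_rate * grad
        for i in self._buckets(features):
            self.weights[i] -= self.learning_rate * grad
        return p


def train_single_pass(model, stream):
    """Visit each example exactly once, in arrival order.
    The returned average log loss uses pre-update predictions only."""
    total_loss, count = 0.0, 0
    for features, clicked in stream:
        p = model.update(features, clicked)
        p = min(max(p, 1e-12), 1.0 - 1e-12)
        total_loss -= clicked * math.log(p) + (1 - clicked) * math.log(1.0 - p)
        count += 1
    return total_loss / max(count, 1)


def synthetic_stream(num_examples, seed=0):
    """Made-up click data: two queries with different true click rates."""
    rng = random.Random(seed)
    click_rate = {"query:shoes": 0.6, "query:taxes": 0.1}
    for _ in range(num_examples):
        query = rng.choice(sorted(click_rate))
        clicked = 1 if rng.random() < click_rate[query] else 0
        yield [query, "position:top"], clicked


if __name__ == "__main__":
    model = OnlineCTRModel()
    avg_loss = train_single_pass(model, synthetic_stream(5000))
    print("average pre-update log loss:", avg_loss)
    print("smelu(0.0):", smelu(0.0))
```

Because the loss is accumulated from predictions made before each update, every example acts as held-out data at the moment it is scored, which is one common way to evaluate a model that never revisits its training stream.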

Implications and Future Work

The engineering insights and practices in this paper reflect the inherent complexities and collaborative nature of building ML systems at industrial scale. The authors describe a landscape in which theoretical ML advances are continually adapted into practical solutions that meet the demands of high-throughput systems and non-stationary data.

Looking ahead, the paper implies a need for continued exploration of more sophisticated model architectures and optimization techniques that remain computationally feasible while pushing accuracy further. Further research could refine model calibration to benefit downstream applications, such as personalized ad recommendations or dynamic pricing strategies.

Overall, this work exemplifies the intersection of cutting-edge research and scalable engineering, showcasing practical strategies that optimize both computational resources and model performance to meet the rigorous demands of modern digital advertising systems.
