Capturing Conversion Rate Fluctuation during Sales Promotions: A Novel Historical Data Reuse Approach (2305.12837v3)

Published 22 May 2023 in cs.IR, cs.AI, and cs.LG

Abstract: Conversion rate (CVR) prediction is one of the core components of online recommender systems, and various approaches have been proposed to obtain accurate and well-calibrated CVR estimates. However, we observe that a well-trained CVR prediction model often performs sub-optimally during sales promotions. This can be largely ascribed to data distribution shift, under which conventional methods no longer work. To this end, we seek to develop alternative modeling techniques for CVR prediction. Observing similar purchase patterns across different promotions, we propose reusing historical promotion data to capture promotional conversion patterns. Specifically, we propose a novel Historical Data Reuse (HDR) approach that first retrieves historically similar promotion data and then fine-tunes the CVR prediction model with the acquired data for better adaptation to the promotion mode. HDR consists of three components: an automated data retrieval module that seeks similar data from historical promotions, a distribution shift correction module that re-weights the retrieved data to better align with the target promotion, and a TransBlock module that quickly fine-tunes the original model to adapt it to the promotion mode. Experiments conducted on real-world data demonstrate the effectiveness of HDR, which improves both ranking and calibration metrics to a large extent. HDR has also been deployed in the display advertising system at Alibaba, bringing a lift of 9% in RPM and 16% in CVR during the Double 11 Sale in 2022.
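The first two stages of the pipeline described in the abstract — retrieving historically similar promotion data, then re-weighting it toward the target promotion's distribution before fine-tuning — can be sketched in simplified form. The paper does not publish reference code, so everything below is an illustrative assumption: the function names, the centroid-based cosine similarity for retrieval, and the similarity-proxy weights standing in for a learned density ratio are not the authors' implementation.

```python
import math

def centroid(samples):
    """Mean feature vector of a set of samples (each a list of floats)."""
    d = len(samples[0])
    return [sum(s[i] for s in samples) / len(samples) for i in range(d)]

def cosine(a, b):
    """Cosine similarity between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def retrieve_similar(history, target_samples, top_k=2):
    """Rank past promotions by centroid similarity to the target promotion
    and return the names of the top_k most similar ones."""
    tc = centroid(target_samples)
    scored = sorted(history.items(),
                    key=lambda kv: cosine(centroid(kv[1]), tc),
                    reverse=True)
    return [name for name, _ in scored[:top_k]]

def importance_weights(retrieved_samples, target_samples):
    """Per-sample weights for the retrieved data, normalized so the mean
    weight is 1. A crude stand-in for density-ratio re-weighting."""
    tc = centroid(target_samples)
    raw = [max(cosine(s, tc), 0.0) for s in retrieved_samples]
    total = sum(raw) or 1.0
    return [r * len(raw) / total for r in raw]
```

In a production system the density ratio for re-weighting would more likely be estimated with a discriminative classifier separating retrieved from target data, and the subsequent fine-tuning would update only a small adapter (the TransBlock of the paper) rather than the full model, so the base model remains intact after the promotion ends.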
