Entire Chain Uplift Modeling with Context-Enhanced Learning for Intelligent Marketing (2402.03379v1)
Abstract: Uplift modeling, vital in online marketing, seeks to accurately measure the impact of various strategies, such as coupons or discounts, on different users by predicting the Individual Treatment Effect (ITE). In an e-commerce setting, user behavior follows a defined sequential chain of impression, click, and conversion. Marketing strategies exert different uplift effects at each stage of this chain, affecting metrics such as click-through rate and conversion rate. Existing research, however, has neglected the inter-task impacts across all stages under a specific treatment and has made insufficient use of the treatment information, which can introduce substantial bias into subsequent marketing decisions. We identify these two issues as the chain-bias problem and the treatment-unadaptive problem. This paper introduces the Entire Chain UPlift method with context-enhanced learning (ECUP), devised to tackle these issues. ECUP consists of two primary components: 1) the Entire Chain-Enhanced Network, which uses user behavior patterns to estimate ITE over the entire chain space, models the impact of treatments on each task, and integrates task prior information to enhance context awareness across all stages; and 2) the Treatment-Enhanced Network, which performs fine-grained treatment modeling through bit-level feature interactions, thereby enabling adaptive feature adjustment. Extensive experiments on public and industrial datasets validate ECUP's effectiveness. Moreover, ECUP has been deployed on the Meituan food delivery platform, serving millions of daily active users, and the related dataset has been released for future research.
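To make the impression, click, conversion chain concrete, the following is a minimal sketch of how average uplift could be measured at each stage from randomized logs. It is not the ECUP model and does not estimate ITE: it only computes group-level differences between treated and control traffic. The `Impression` record and the `chain_uplift` function are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass(frozen=True)
class Impression:
    """One logged impression (hypothetical schema, not from the paper)."""
    treated: bool    # user received the marketing treatment, e.g. a coupon
    clicked: bool    # impression led to a click
    converted: bool  # click led to a conversion (only meaningful if clicked)


def _rate(hits: int, total: int) -> float:
    """Safe ratio that returns 0.0 for an empty denominator."""
    return hits / total if total else 0.0


def chain_uplift(logs: Iterable[Impression]) -> Dict[str, float]:
    """Average uplift (treated minus control) for CTR, CVR, and CTCVR."""
    # Per arm: [impressions, clicks, conversions]
    counts = {True: [0, 0, 0], False: [0, 0, 0]}
    for row in logs:
        arm = counts[row.treated]
        arm[0] += 1
        arm[1] += int(row.clicked)
        arm[2] += int(row.clicked and row.converted)

    rates = {}
    for treated, (n, clicks, convs) in counts.items():
        rates[treated] = {
            "ctr": _rate(clicks, n),    # click | impression
            "cvr": _rate(convs, clicks),  # conversion | click
            "ctcvr": _rate(convs, n),   # conversion | impression
        }
    return {k: rates[True][k] - rates[False][k] for k in ("ctr", "cvr", "ctcvr")}


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    toy_logs = (
        [Impression(True, True, True)] * 2
        + [Impression(True, True, False)]
        + [Impression(True, False, False)]
        + [Impression(False, True, True)]
        + [Impression(False, True, False)]
        + [Impression(False, False, False)] * 2
    )
    print(chain_uplift(toy_logs))
```

Note that the CVR difference is computed only over clicked impressions, a subpopulation that the treatment itself can change, so even with randomized treatment it is not a clean causal estimate. The CTR and CTCVR differences are defined over all impressions and do not have this problem, which is one reason to estimate effects over the entire chain space.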