Double Machine Learning at Scale to Predict Causal Impact of Customer Actions (2409.02332v1)

Published 3 Sep 2024 in cs.LG, econ.EM, stat.AP, and stat.ME

Abstract: Causal Impact (CI) of customer actions is broadly used across the industry to inform both short- and long-term investment decisions of various types. In this paper, we apply the double machine learning (DML) methodology to estimate CI values across hundreds of customer actions of business interest and hundreds of millions of customers. We operationalize DML through a causal ML library based on Spark with a flexible, JSON-driven model configuration approach to estimate CI at scale (i.e., across hundreds of actions and millions of customers). We outline the DML methodology and implementation, and the associated benefits over the traditional potential-outcomes-based CI model. We show population-level as well as customer-level CI values along with confidence intervals. The validation metrics show a 2.2% gain over the baseline methods and a 2.5X gain in computational time. Our contribution is to advance the scalable application of CI, while also providing an interface that allows faster experimentation, cross-platform support, the ability to onboard new use cases, and improved accessibility of the underlying code for partner teams.
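For readers unfamiliar with DML, the following is a minimal sketch of the general idea on synthetic data, and not the paper's Spark library or its JSON configuration. It assumes a partially linear model with a continuous treatment, uses plain least squares as a stand-in for the ML nuisance learners, and all names (`dml_plr`, `fit_ols`, `predict_ols`) are hypothetical. It cross-fits the two nuisance models, regresses the outcome residual on the treatment residual, and reports a heteroskedasticity-robust standard error with a normal-approximation 95% confidence interval.

```python
import numpy as np

def fit_ols(X, y):
    # Least-squares fit with an intercept; stand-in for any ML nuisance learner.
    Xb = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(Xb, y, rcond=None)
    return coef

def predict_ols(coef, X):
    return np.column_stack([np.ones(len(X)), X]) @ coef

def dml_plr(X, d, y, n_folds=2, seed=0):
    """Cross-fitted DML estimate of theta in y = theta*d + g(X) + noise."""
    rng = np.random.default_rng(seed)
    n = len(y)
    folds = np.array_split(rng.permutation(n), n_folds)
    y_res = np.empty(n)
    d_res = np.empty(n)
    for k in range(n_folds):
        test = folds[k]
        train = np.concatenate([folds[j] for j in range(n_folds) if j != k])
        # Nuisance models are fit on the other folds and applied to the held-out fold.
        y_res[test] = y[test] - predict_ols(fit_ols(X[train], y[train]), X[test])
        d_res[test] = d[test] - predict_ols(fit_ols(X[train], d[train]), X[test])
    # Residual-on-residual regression gives the causal effect estimate.
    theta = (d_res @ y_res) / (d_res @ d_res)
    # Heteroskedasticity-robust (sandwich) standard error.
    eps = y_res - theta * d_res
    se = np.sqrt(np.sum(d_res**2 * eps**2)) / (d_res @ d_res)
    return theta, se

# Synthetic data with a known effect of 2.0 (illustrative only).
rng = np.random.default_rng(42)
n = 5000
X = rng.normal(size=(n, 3))
d = X @ np.array([0.5, -0.3, 0.2]) + rng.normal(size=n)
y = 2.0 * d + X @ np.array([1.0, 0.5, -1.0]) + rng.normal(size=n)

theta, se = dml_plr(X, d, y)
ci = (theta - 1.96 * se, theta + 1.96 * se)
print(f"theta = {theta:.3f}, 95% CI = ({ci[0]:.3f}, {ci[1]:.3f})")
```

In practice the least-squares nuisance fits would be replaced by flexible learners, and a binary customer action would typically use a propensity model for the treatment; both are outside what this sketch covers.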
