Causal Customer Churn Analysis with Low-rank Tensor Block Hazard Model (2405.11377v1)
Abstract: This study introduces a method for analyzing the impact of various interventions on customer churn within the potential outcomes framework. We present a new causal model, the tensorized latent factor block hazard model, which incorporates tensor completion methods for a principled causal analysis of customer churn. A crucial element of our approach is formulating the estimation of the parameter tensor as a 1-bit tensor completion problem. This formulation captures hidden customer characteristics and temporal elements from churn records, addressing both the binary nature of churn data and its time-monotonic trends. Our model also groups interventions that have similar impacts, which improves the precision and practicality of customer retention strategies. For computational efficiency, we apply a projected gradient descent algorithm combined with spectral clustering. We establish the theoretical groundwork for our model, including its non-asymptotic properties. Comprehensive experiments on both simulated and real-world applications further validate the efficacy and superiority of our model.
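To make the idea of 1-bit completion by projected gradient descent concrete, here is a minimal sketch under simplifying assumptions; it is not the authors' algorithm. It works on a matrix rather than a tensor, assumes a logistic link between the latent parameter and the binary outcome, uses a truncated SVD as the projection onto rank-constrained matrices, and omits the spectral clustering step and the hazard structure entirely. The function names (`neg_log_lik`, `project_rank`, `one_bit_pgd`) and all parameter values are hypothetical.

```python
import numpy as np

def neg_log_lik(theta, y, mask):
    # Bernoulli negative log-likelihood under an assumed logistic link,
    # summed over observed entries only (mask == 1).
    return float(np.sum(mask * (np.logaddexp(0.0, theta) - y * theta)))

def project_rank(theta, rank):
    # Best rank-`rank` approximation in Frobenius norm via truncated SVD.
    u, s, vt = np.linalg.svd(theta, full_matrices=False)
    return (u[:, :rank] * s[:rank]) @ vt[:rank]

def one_bit_pgd(y, mask, rank=2, step=1.0, iters=100):
    # Projected gradient descent: gradient step on the likelihood,
    # then projection back onto matrices of rank at most `rank`.
    theta = np.zeros(y.shape)
    for _ in range(iters):
        grad = mask * (1.0 / (1.0 + np.exp(-theta)) - y)
        theta = project_rank(theta - step * grad, rank)
    return theta

# Synthetic illustration (all values are made up for the sketch).
rng = np.random.default_rng(0)
theta_true = rng.normal(size=(30, 2)) @ rng.normal(size=(2, 20))
y = (rng.random((30, 20)) < 1.0 / (1.0 + np.exp(-theta_true))).astype(float)
mask = (rng.random((30, 20)) < 0.7).astype(float)  # roughly 70% of entries observed
theta_hat = one_bit_pgd(y, mask)
```

In this sketch `theta_hat` is a low-rank estimate of the latent parameter matrix, and `1 / (1 + exp(-theta_hat))` gives fitted probabilities for both observed and unobserved entries. The step size of 1.0 is chosen to stay below the inverse of the logistic loss's smoothness constant (1/4), so each iteration does not increase the observed-entry likelihood loss.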