Empirical Analysis of Model Selection for Heterogeneous Causal Effect Estimation (2211.01939v3)

Published 3 Nov 2022 in cs.LG, cs.AI, and stat.ME

Abstract: We study the problem of model selection in causal inference, specifically for conditional average treatment effect (CATE) estimation. Unlike machine learning, there is no perfect analogue of cross-validation for model selection as we do not observe the counterfactual potential outcomes. Towards this, a variety of surrogate metrics have been proposed for CATE model selection that use only observed data. However, we do not have a good understanding regarding their effectiveness due to limited comparisons in prior studies. We conduct an extensive empirical analysis to benchmark the surrogate model selection metrics introduced in the literature, as well as the novel ones introduced in this work. We ensure a fair comparison by tuning the hyperparameters associated with these metrics via AutoML, and provide more detailed trends by incorporating realistic datasets via generative modeling. Our analysis suggests novel model selection strategies based on careful hyperparameter selection of CATE estimators and causal ensembling.
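To make the idea of a surrogate metric concrete, here is a minimal sketch in Python. It is not taken from the paper: it assumes a randomized treatment with known propensity `e`, builds an inverse-propensity-weighted pseudo-outcome whose conditional mean equals the CATE, and ranks candidate CATE predictions by mean squared error against that pseudo-outcome. The data-generating process, the function names (`ipw_pseudo_outcome`, `surrogate_score`), and the three candidate models are all hypothetical and chosen only for illustration.

```python
import numpy as np


def ipw_pseudo_outcome(y, t, e):
    """IPW pseudo-outcome: its conditional mean given x equals the CATE
    when e is the true treatment propensity (assumed known here)."""
    return y * (t - e) / (e * (1.0 - e))


def surrogate_score(tau_hat, y, t, e):
    """Mean squared error between the pseudo-outcome and predicted CATE.
    Uses only observed (y, t); lower is better."""
    return float(np.mean((ipw_pseudo_outcome(y, t, e) - tau_hat) ** 2))


# Synthetic randomized experiment (illustrative assumption, not real data).
rng = np.random.default_rng(0)
n = 20000
x = rng.uniform(-1.0, 1.0, size=n)
e = 0.5                                   # known propensity
t = rng.binomial(1, e, size=n)            # observed treatment
true_tau = 1.0 + 2.0 * x                  # unobservable in practice
y = x + t * true_tau + rng.normal(0.0, 0.1, size=n)

# Hypothetical candidate CATE predictions to choose between.
candidates = {
    "linear": 1.0 + 2.0 * x,
    "constant": np.full(n, 1.0),
    "wrong_sign": -(1.0 + 2.0 * x),
}

scores = {name: surrogate_score(tau_hat, y, t, e)
          for name, tau_hat in candidates.items()}
best = min(scores, key=scores.get)
print(scores, best)
```

The score never touches `true_tau`, which is the point: selection relies on observed outcomes and treatments alone. In observational data the propensity would itself have to be estimated, which is one reason such metrics carry their own hyperparameters.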
