Calibrating doubly-robust estimators with unbalanced treatment assignment

Published 3 Mar 2024 in econ.EM and stat.ML | (2403.01585v2)

Abstract: Machine learning methods, particularly the double machine learning (DML) estimator (Chernozhukov et al., 2018), are increasingly popular for estimating the average treatment effect (ATE). However, datasets often exhibit unbalanced treatment assignment, where only a few observations are treated, which leads to unstable propensity score estimates. We propose a simple extension of the DML estimator that undersamples the data for propensity score modeling and calibrates the resulting scores to match the original distribution. The paper provides theoretical results showing that the estimator retains the DML estimator's asymptotic properties. A simulation study illustrates the finite-sample performance of the estimator.
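The undersample-then-calibrate idea in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the function names `undersample_controls` and `calibrate_score` are hypothetical, the example uses no covariates (so the propensity score is just the treated share), and the correction formula is assumed to be the one from Dal Pozzolo et al. (2015), which the paper cites; the paper's exact calibration step may differ.

```python
import random


def undersample_controls(treatment, keep_rate, seed=0):
    """Return indices of all treated units plus a random share of controls.

    `keep_rate` (beta) is the probability of keeping each control unit.
    """
    rng = random.Random(seed)
    return [i for i, d in enumerate(treatment) if d == 1 or rng.random() < keep_rate]


def calibrate_score(p_s, keep_rate):
    """Map a score fitted on undersampled data back to the full-sample scale.

    Assumed correction (Dal Pozzolo et al., 2015):
        p = beta * p_s / (beta * p_s - p_s + 1)
    where beta is the control keep rate.
    """
    return keep_rate * p_s / (keep_rate * p_s - p_s + 1.0)


# Toy data: 50 treated and 950 control units, so the true treated share is 0.05.
treatment = [1] * 50 + [0] * 950
n_controls = treatment.count(0)

idx = undersample_controls(treatment, keep_rate=0.1, seed=0)
n_treated = sum(treatment[i] for i in idx)
n_controls_kept = len(idx) - n_treated

# Use the realized keep rate so the correction is exact in this toy case.
realized_keep_rate = n_controls_kept / n_controls

# Without covariates, the "fitted" score on the undersample is the treated share.
p_s = n_treated / len(idx)
p = calibrate_score(p_s, realized_keep_rate)

print(f"undersampled score: {p_s:.4f}, calibrated score: {p:.4f}")
```

In this toy case the score on the undersampled data is inflated because controls were dropped, and the calibration maps it back to the original treated share of 0.05. In a DML setting, the same correction would be applied pointwise to the predictions of a propensity model trained on the undersampled data.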

References (28)
  1. Athey, S. and G. W. Imbens (2019): “Machine Learning Methods That Economists Should Know About,” Annual Review of Economics, 11, 685–725.
  2. Bach, P., O. Schacht, V. Chernozhukov, S. Klaassen, and M. Spindler (2024): “Hyperparameter Tuning for Causal Inference with Double Machine Learning: A Simulation Study,” Preprint (arXiv:2402.04674).
  3. Belloni, A. and V. Chernozhukov (2013): “Least squares after model selection in high-dimensional sparse models,” Bernoulli, 19, 521–547.
  4. Breiman, L. (2001): “Random forests,” Machine learning, 45, 5–32.
  5. Chen, X. and H. White (1999): “Improved rates and asymptotic normality for nonparametric neural network estimators,” IEEE Transactions on Information Theory, 45, 682–691.
  6. Chernozhukov, V., D. Chetverikov, M. Demirer, E. Duflo, C. Hansen, W. Newey, and J. Robins (2018): “Double/debiased machine learning for treatment and structural parameters,” The Econometrics Journal, 21, C1–C68.
  7. Fan, Q., Y.-C. Hsu, R. P. Lieli, and Y. Zhang (2022): “Estimation of conditional average treatment effects with high-dimensional data,” Journal of Business & Economic Statistics, 40, 313–327.
  8. Farrell, M. H. (2015): “Robust inference on average treatment effects with possibly more covariates than observations,” Journal of Econometrics, 189, 1–23.
  9. Friedman, J. H. (1991): “Multivariate Adaptive Regression Splines,” The Annals of Statistics, 19, 1–67.
  10. Hahn, J. (1998): “On the Role of the Propensity Score in Efficient Semiparametric Estimation of Average Treatment Effects,” Econometrica, 66, 315–331.
  11. He, H. and E. A. Garcia (2009): “Learning from Imbalanced Data,” IEEE Transactions on Knowledge and Data Engineering, 21, 1263–1284.
  12. Huber, M., M. Lechner, and C. Wunsch (2013): “The performance of estimators based on the propensity score,” Journal of Econometrics, 175, 1–21.
  13. Hujer, R., S. L. Thomsen, and C. Zeiss (2006): “The effects of vocational training programmes on the duration of unemployment in Eastern Germany,” AStA Advances in Statistical Analysis, 90, 299–321.
  14. Imbens, G. W. (2004): “Nonparametric Estimation of Average Treatment Effects Under Exogeneity: A Review,” The Review of Economics and Statistics, 86, 4–29.
  15. Japkowicz, N. and S. Stephen (2002): “The class imbalance problem: A systematic study,” Intelligent Data Analysis, 6, 429–449.
  16. Knaus, M. C. (2020): “A Double Machine Learning Approach to Estimate the Effects of Musical Practice on Student’s Skills,” Journal of the Royal Statistical Society Series A: Statistics in Society, 184, 282–300.
  17. Künzel, S. R., J. S. Sekhon, P. J. Bickel, and B. Yu (2019): “Metalearners for estimating heterogeneous treatment effects using machine learning,” Proceedings of the National Academy of Sciences, 116, 4156–4165.
  18. Luo, Y., M. Spindler, and J. Kück (2016): “High-Dimensional $L_2$ Boosting: Rate of Convergence,” Preprint (arXiv:1602.08927).
  19. Nie, X. and S. Wager (2020): “Quasi-oracle estimation of heterogeneous treatment effects,” Biometrika, 108, 299–319.
  20. Pedregosa, F., G. Varoquaux, A. Gramfort, V. Michel, B. Thirion, O. Grisel, M. Blondel, P. Prettenhofer, R. Weiss, V. Dubourg, J. Vanderplas, A. Passos, D. Cournapeau, M. Brucher, M. Perrot, and E. Duchesnay (2011): “Scikit-learn: Machine Learning in Python,” Journal of Machine Learning Research, 12, 2825–2830.
  21. Pozzolo, A. D., O. Caelen, R. A. Johnson, and G. Bontempi (2015): “Calibrating Probability with Undersampling for Unbalanced Classification,” in 2015 IEEE Symposium Series on Computational Intelligence, 159–166.
  22. Rosenbaum, P. R. and D. B. Rubin (1983): “The central role of the propensity score in observational studies for causal effects,” Biometrika, 70, 41–55.
  23. Rubin, D. (1974): “Estimating causal effects of treatments in randomized and nonrandomized studies,” Journal of Educational Psychology, 66, 688–701.
  24. Słoczyński, T. and J. M. Wooldridge (2018): “A general double robustness result for estimating average treatment effects,” Econometric Theory, 34, 112–133.
  25. Wager, S. (2022): “STATS 361: Causal Inference,” Lecture Notes.
  26. Wager, S. and G. Walther (2016): “Adaptive concentration of regression trees, with application to random forests,” Preprint (arXiv:1503.06388).
  27. Yao, L., Z. Chu, S. Li, Y. Li, J. Gao, and A. Zhang (2021): “A Survey on Causal Inference,” ACM Transactions on Knowledge Discovery from Data, 15.
  28. Zimmert, M. and M. Lechner (2019): “Nonparametric estimation of causal heterogeneity under high-dimensional confounding,” Preprint (arXiv:1908.08779).