
Calibrating doubly-robust estimators with unbalanced treatment assignment (2403.01585v2)

Published 3 Mar 2024 in econ.EM and stat.ML

Abstract: Machine learning methods, particularly the double machine learning (DML) estimator (Chernozhukov et al., 2018), are increasingly popular for the estimation of the average treatment effect (ATE). However, datasets often exhibit unbalanced treatment assignments where only a few observations are treated, leading to unstable propensity score estimations. We propose a simple extension of the DML estimator which undersamples data for propensity score modeling and calibrates scores to match the original distribution. The paper provides theoretical results showing that the estimator retains the DML estimator's asymptotic properties. A simulation study illustrates the finite sample performance of the estimator.
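The core mechanic of the abstract, undersampling the majority (control) class before fitting the propensity model and then calibrating scores back to the original treatment prevalence, can be sketched with the analytical correction of Dal Pozzolo et al. (2015, reference 21): if controls are kept with rate β, a score p_s from the undersampled data maps back via p = β·p_s / (β·p_s − p_s + 1). The snippet below is a minimal illustration on simulated data, not the authors' implementation; the data-generating process and the logistic propensity model are assumptions for demonstration.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Simulated covariates with a rare treatment (unbalanced assignment).
n = 5000
X = rng.normal(size=(n, 3))
p_true = 1 / (1 + np.exp(-(X[:, 0] - 3)))       # true propensity, mostly small
D = rng.binomial(1, p_true)                     # treatment indicator

# Undersample controls so the propensity model sees balanced classes.
idx_t = np.flatnonzero(D == 1)
idx_c = np.flatnonzero(D == 0)
beta = len(idx_t) / len(idx_c)                  # keep-rate for controls
keep_c = rng.choice(idx_c, size=len(idx_t), replace=False)
idx = np.concatenate([idx_t, keep_c])

clf = LogisticRegression().fit(X[idx], D[idx])
p_s = clf.predict_proba(X)[:, 1]                # scores on the undersampled scale

# Calibrate back to the original treatment prevalence
# (Dal Pozzolo et al., 2015): p = beta * p_s / (beta * p_s - p_s + 1).
p_hat = beta * p_s / (beta * p_s - p_s + 1)
```

The calibrated `p_hat` would then replace the raw propensity estimate in the DML/AIPW score; since β < 1, calibration shrinks every score toward the true (low) treatment prevalence, which is what stabilizes the inverse-propensity weights.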

References (28)
  1. Athey, S. and G. W. Imbens (2019): “Machine Learning Methods That Economists Should Know About,” Annual Review of Economics, 11, 685–725.
  2. Bach, P., O. Schacht, V. Chernozhukov, S. Klaassen, and M. Spindler (2024): “Hyperparameter Tuning for Causal Inference with Double Machine Learning: A Simulation Study,” Preprint (arXiv:2402.04674).
  3. Belloni, A. and V. Chernozhukov (2013): “Least squares after model selection in high-dimensional sparse models,” Bernoulli, 19, 521–547.
  4. Breiman, L. (2001): “Random forests,” Machine learning, 45, 5–32.
  5. Chen, X. and H. White (1999): “Improved rates and asymptotic normality for nonparametric neural network estimators,” IEEE Transactions on Information Theory, 45, 682–691.
  6. Chernozhukov, V., D. Chetverikov, M. Demirer, E. Duflo, C. Hansen, W. Newey, and J. Robins (2018): “Double/debiased machine learning for treatment and structural parameters,” The Econometrics Journal, 21, C1–C68.
  7. Fan, Q., Y.-C. Hsu, R. P. Lieli, and Y. Zhang (2022): “Estimation of conditional average treatment effects with high-dimensional data,” Journal of Business & Economic Statistics, 40, 313–327.
  8. Farrell, M. H. (2015): “Robust inference on average treatment effects with possibly more covariates than observations,” Journal of Econometrics, 189, 1–23.
  9. Friedman, J. H. (1991): “Multivariate Adaptive Regression Splines,” The Annals of Statistics, 19, 1–67.
  10. Hahn, J. (1998): “On the Role of the Propensity Score in Efficient Semiparametric Estimation of Average Treatment Effects,” Econometrica, 66, 315–331.
  11. He, H. and E. A. Garcia (2009): “Learning from Imbalanced Data,” IEEE Transactions on Knowledge and Data Engineering, 21, 1263–1284.
  12. Huber, M., M. Lechner, and C. Wunsch (2013): “The performance of estimators based on the propensity score,” Journal of Econometrics, 175, 1–21.
  13. Hujer, R., S. L. Thomsen, and C. Zeiss (2006): “The effects of vocational training programmes on the duration of unemployment in Eastern Germany,” AStA Advances in Statistical Analysis, 90, 299–321.
  14. Imbens, G. W. (2004): “Nonparametric Estimation of Average Treatment Effects Under Exogeneity: A Review,” The Review of Economics and Statistics, 86, 4–29.
  15. Japkowicz, N. and S. Stephen (2002): “The class imbalance problem: A systematic study,” Intelligent Data Analysis, 6, 429–449.
  16. Knaus, M. C. (2020): “A Double Machine Learning Approach to Estimate the Effects of Musical Practice on Student’s Skills,” Journal of the Royal Statistical Society Series A: Statistics in Society, 184, 282–300.
  17. Künzel, S. R., J. S. Sekhon, P. J. Bickel, and B. Yu (2019): “Metalearners for estimating heterogeneous treatment effects using machine learning,” Proceedings of the National Academy of Sciences, 116, 4156–4165.
  18. Luo, Y., M. Spindler, and J. Kück (2016): “High-Dimensional $L_2$ Boosting: Rate of Convergence,” Preprint (arXiv:1602.08927).
  19. Nie, X. and S. Wager (2020): “Quasi-oracle estimation of heterogeneous treatment effects,” Biometrika, 108, 299–319.
  20. Pedregosa, F., G. Varoquaux, A. Gramfort, V. Michel, B. Thirion, O. Grisel, M. Blondel, P. Prettenhofer, R. Weiss, V. Dubourg, J. Vanderplas, A. Passos, D. Cournapeau, M. Brucher, M. Perrot, and E. Duchesnay (2011): “Scikit-learn: Machine Learning in Python,” Journal of Machine Learning Research, 12, 2825–2830.
  21. Dal Pozzolo, A., O. Caelen, R. A. Johnson, and G. Bontempi (2015): “Calibrating Probability with Undersampling for Unbalanced Classification,” in 2015 IEEE Symposium Series on Computational Intelligence, 159–166.
  22. Rosenbaum, P. R. and D. B. Rubin (1983): “The central role of the propensity score in observational studies for causal effects,” Biometrika, 70, 41–55.
  23. Rubin, D. B. (1974): “Estimating causal effects of treatments in randomized and nonrandomized studies,” Journal of Educational Psychology, 66, 688–701.
  24. Słoczyński, T. and J. M. Wooldridge (2018): “A general double robustness result for estimating average treatment effects,” Econometric Theory, 34, 112–133.
  25. Wager, S. (2022): “STATS 361: Causal Inference,” Lecture Notes.
  26. Wager, S. and G. Walther (2016): “Adaptive concentration of regression trees, with application to random forests,” Preprint (arXiv:1503.06388).
  27. Yao, L., Z. Chu, S. Li, Y. Li, J. Gao, and A. Zhang (2021): “A Survey on Causal Inference,” ACM Trans. Knowl. Discov. Data, 15.
  28. Zimmert, M. and M. Lechner (2019): “Nonparametric estimation of causal heterogeneity under high-dimensional confounding,” Preprint (arXiv:1908.08779).