A Double Machine Learning Approach to Combining Experimental and Observational Data (2307.01449v2)

Published 4 Jul 2023 in stat.ME, cs.AI, cs.LG, and econ.EM

Abstract: Experimental and observational studies often lack validity due to untestable assumptions. We propose a double machine learning approach to combine experimental and observational studies, allowing practitioners to test for assumption violations and estimate treatment effects consistently. Our framework tests for violations of external validity and ignorability under milder assumptions. When only one of these assumptions is violated, we provide semiparametrically efficient treatment effect estimators. However, our no-free-lunch theorem highlights the necessity of accurately identifying the violated assumption for consistent treatment effect estimation. Through comparative analyses, we show our framework's superiority over existing data fusion methods. The practical utility of our approach is further exemplified by three real-world case studies, underscoring its potential for widespread application in empirical research.
