Sharp variance estimator and causal bootstrap in stratified randomized experiments (2401.16667v3)
Abstract: Randomized experiments are the gold standard for estimating treatment effects, and randomization serves as a reasoned basis for inference. In widely used stratified randomized experiments, randomization-based finite-population asymptotic theory enables valid inference for the average treatment effect, relying on normal approximation and a Neyman-type conservative variance estimator. However, when the sample size is small or the outcomes are skewed, the Neyman-type variance estimator may become overly conservative, and the normal approximation can fail. To address these issues, we propose a sharp variance estimator and two causal bootstrap methods to more accurately approximate the sampling distribution of the weighted difference-in-means estimator in stratified randomized experiments. The first causal bootstrap procedure is based on rank-preserving imputation and we prove its second-order refinement over normal approximation. The second causal bootstrap procedure is based on constant-treatment-effect imputation and is further applicable in paired experiments. In contrast to traditional bootstrap methods, where randomness originates from hypothetical super-population sampling, our analysis for the proposed causal bootstrap is randomization-based, relying solely on the randomness of treatment assignment in randomized experiments. Numerical studies and two real data applications demonstrate advantages of our proposed methods in finite samples. The R package CausalBootstrap implementing our method is publicly available.
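To make the quantities named in the abstract concrete, the following is a minimal Python sketch of the weighted difference-in-means estimator, the textbook Neyman-type conservative variance estimator, and a resampling loop built on constant-treatment-effect imputation. It is an illustration under stated assumptions, not the paper's procedure and not the API of the CausalBootstrap package: the function names, the toy data, and the choice to return unstudentized draws are all hypothetical, and the paper's actual bootstrap may differ (for example, in how the statistic is studentized). The sketch also assumes at least two units per arm in every stratum, so it does not cover paired experiments.

```python
import random
from statistics import mean, variance


def weighted_diff_in_means(strata):
    """Return (tau_hat, var_hat) for a stratified experiment.

    strata: list of (outcomes, assignments) pairs, one per stratum,
    with assignments coded 1 (treated) or 0 (control).
    Assumes every stratum has at least two units in each arm.
    """
    n = sum(len(y) for y, _ in strata)
    tau_hat, var_hat = 0.0, 0.0
    for y, z in strata:
        treated = [yi for yi, zi in zip(y, z) if zi == 1]
        control = [yi for yi, zi in zip(y, z) if zi == 0]
        w = len(y) / n  # stratum weight: share of all units
        tau_hat += w * (mean(treated) - mean(control))
        # Neyman-type term: sample variance of each arm over its arm size
        var_hat += w ** 2 * (
            variance(treated) / len(treated) + variance(control) / len(control)
        )
    return tau_hat, var_hat


def constant_effect_draws(strata, n_draws=2000, seed=0):
    """Re-randomize within strata after constant-effect imputation.

    Hypothetical sketch: impute Y(0) = Y - tau_hat for treated units and
    Y(1) = Y + tau_hat for control units, then redraw assignments with
    the same number of treated units per stratum.
    """
    rng = random.Random(seed)
    tau_hat, _ = weighted_diff_in_means(strata)
    imputed = []
    for y, z in strata:
        y0 = [yi - tau_hat * zi for yi, zi in zip(y, z)]  # imputed control outcomes
        imputed.append((y0, sum(z)))
    draws = []
    for _ in range(n_draws):
        resampled = []
        for y0, n_treated in imputed:
            chosen = set(rng.sample(range(len(y0)), n_treated))
            z_star = [1 if i in chosen else 0 for i in range(len(y0))]
            y_star = [y0i + tau_hat * zi for y0i, zi in zip(y0, z_star)]
            resampled.append((y_star, z_star))
        tau_star, _ = weighted_diff_in_means(resampled)
        draws.append(tau_star)
    return draws


# Toy data (made up for illustration): two strata of sizes 6 and 4.
strata = [
    ([4.0, 6.0, 5.0, 1.0, 3.0, 2.0], [1, 1, 1, 0, 0, 0]),
    ([7.0, 9.0, 2.0, 4.0], [1, 1, 0, 0]),
]
tau_hat, var_hat = weighted_diff_in_means(strata)
draws = constant_effect_draws(strata)
print(round(tau_hat, 4), round(var_hat, 4), round(mean(draws), 4))
```

Because the imputed population has a constant effect equal to the point estimate, the resampled estimates are centered at that estimate, and their spread reflects only the randomness of treatment assignment, which is the randomization-based viewpoint the abstract contrasts with super-population bootstrap methods.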