
Balancing Weights for Non-monotone Missing Data

Published 14 Feb 2024 in stat.ME | (2402.08873v2)

Abstract: Balancing weights have been widely applied to single or monotone missingness because of their empirical advantages over likelihood-based methods and inverse probability weighting approaches. This paper considers non-monotone missing data under the complete-case missing variable (CCMV) condition, a case of missing not at random (MNAR). Using relationships between each missing pattern and the complete-case subsample, we construct a weighted estimator in which the weight is a sum of ratios: for each missing pattern, the conditional probability of observing that pattern divided by the conditional probability of observing the complete case, given the variables observed in that pattern. However, plug-in estimators of these propensity odds can be unbounded and lead to unstable estimation. Exploiting further relations between the propensity odds and the balancing of moments across response patterns, we employ tailored loss functions, each encouraging empirical balance across patterns, to estimate the propensity odds flexibly with a functional basis expansion. We propose two penalizations, one controlling the smoothness of the propensity odds model and one controlling empirical imbalance. We study the asymptotic properties of the proposed estimators and show that they are consistent under mild smoothness assumptions. We also establish asymptotic normality and efficiency. Simulation results show the superior performance of the proposed method.
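To make the weighting idea in the abstract concrete, here is a minimal sketch under strong simplifying assumptions that are not taken from the paper: only two variables, only one incomplete pattern, and propensity odds that are known exactly rather than estimated. The names `P_MISSING_GIVEN_X1`, `propensity_odds`, `simulate`, `ccmv_weighted_mean`, and `complete_case_mean` are hypothetical. The paper's contribution is estimating the odds through balancing-oriented loss functions, which this sketch does not attempt.

```python
import random

# Hypothetical two-variable setup: X1 is always observed, X2 may be missing.
# Pattern "11" = complete case, pattern "10" = X2 missing.
P_MISSING_GIVEN_X1 = {0: 0.2, 1: 0.6}  # assumed P(R = "10" | X1)


def propensity_odds(x1):
    """True odds P(R="10" | x1) / P(R="11" | x1); unknown in practice."""
    p = P_MISSING_GIVEN_X1[x1]
    return p / (1.0 - p)


def simulate(n, seed=0):
    """Draw (x1, x2, pattern) rows; x2 is None when it is missing."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x1 = 1 if rng.random() < 0.5 else 0
        x2 = x1 + rng.gauss(0.0, 1.0)  # so the target E[X2] equals 0.5
        missing = rng.random() < P_MISSING_GIVEN_X1[x1]
        rows.append((x1, None if missing else x2, "10" if missing else "11"))
    return rows


def ccmv_weighted_mean(rows):
    """Sum of weight * X2 over complete cases, divided by the full sample size.

    The weight is 1 (the complete pattern's own odds) plus the odds of
    each incomplete pattern, which here is only pattern "10".
    """
    total = 0.0
    for x1, x2, pattern in rows:
        if pattern == "11":
            total += (1.0 + propensity_odds(x1)) * x2
    return total / len(rows)


def complete_case_mean(rows):
    """Naive unweighted mean of X2 among complete cases."""
    values = [x2 for _, x2, pattern in rows if pattern == "11"]
    return sum(values) / len(values)


rows = simulate(50_000)
weighted = ccmv_weighted_mean(rows)
naive = complete_case_mean(rows)
print(f"weighted: {weighted:.3f}, complete-case: {naive:.3f}, truth: 0.500")
```

In this two-pattern toy case the CCMV condition reduces to ordinary missing at random. With non-monotone missingness, the weight would gain one odds term per additional pattern, each conditioning on the variables observed in that pattern. The odds reach 1.5 here and grow without bound as the complete-case probability approaches zero, which is the instability the abstract attributes to plug-in estimators.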

