
Gaussian Copula Models for Nonignorable Missing Data Using Auxiliary Marginal Quantiles (2406.03463v2)

Published 5 Jun 2024 in stat.ME, stat.AP, and stat.ML

Abstract: We present an approach for modeling and imputation of nonignorable missing data. Our approach uses Bayesian data integration to combine (1) a Gaussian copula model for all study variables and missingness indicators, which allows arbitrary marginal distributions, nonignorable missingness, and other dependencies, and (2) auxiliary information in the form of marginal quantiles for some study variables. We prove that, remarkably, one only needs a small set of accurately specified quantiles to estimate the copula correlation consistently. The remaining marginal distribution functions are inferred nonparametrically and jointly with the copula parameters using an efficient MCMC algorithm. We also characterize the (additive) nonignorable missingness mechanism implied by the copula model. Simulations confirm the effectiveness of this approach for multivariate imputation with nonignorable missing data. We apply the model to analyze associations between lead exposure and end-of-grade test scores for 170,000 North Carolina students. Lead exposure has nonignorable missingness: children with higher exposure are more likely to be measured. We elicit marginal quantiles for lead exposure using statistics provided by the Centers for Disease Control and Prevention. Multiple imputation inferences under our model support stronger, more adverse associations between lead exposure and educational outcomes relative to complete case and missing-at-random analyses.
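As a rough illustration of the setting the abstract describes, the toy simulation below draws a study variable and a missingness indicator from a bivariate Gaussian copula and shows how complete-case summaries drift when the two are correlated. It is only a sketch under stated assumptions, not the authors' model or MCMC algorithm: the correlation `rho`, the lognormal margin, the 50% observation threshold, and all function and variable names are hypothetical choices made for this example.

```python
import numpy as np


def simulate_copula_missingness(n=200_000, rho=0.6, seed=0):
    """Toy bivariate Gaussian copula: one study variable, one missingness indicator.

    All settings here (rho, lognormal margin, threshold at 0) are illustrative
    assumptions, not values taken from the paper.
    """
    rng = np.random.default_rng(seed)
    cov = np.array([[1.0, rho], [rho, 1.0]])
    # Latent Gaussian pair: z[:, 0] drives the study variable,
    # z[:, 1] drives whether the value is observed.
    z = rng.multivariate_normal(np.zeros(2), cov, size=n)
    # A monotone transform of the latent variable gives a non-Gaussian margin
    # (lognormal here) while leaving the copula dependence unchanged.
    y = np.exp(z[:, 0])
    # The value is observed when the second latent variable is positive, so with
    # rho > 0 larger values of y are more likely to be observed (nonignorable).
    observed = z[:, 1] > 0
    return y, observed


y, observed = simulate_copula_missingness()
full_median = np.median(y)                     # population-level quantile
complete_case_median = np.median(y[observed])  # what a complete-case analysis sees

print(f"fraction observed:    {observed.mean():.3f}")
print(f"full-data median:     {full_median:.3f}")
print(f"complete-case median: {complete_case_median:.3f}")
```

In this toy setup the complete-case median sits above the full-data median, because units with larger values are more likely to be observed. An externally supplied marginal quantile, such as the true median, is the kind of auxiliary information that exposes this gap, which is the intuition behind combining the copula model with auxiliary quantiles.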
