Handling multivariable missing data in causal mediation analysis estimating interventional effects (2403.17396v3)
Abstract: The interventional effects approach to causal mediation analysis is increasingly common in epidemiologic research, given its potential to address policy-relevant questions about hypothetical mediator interventions. Multiple imputation (MI) is widely used for handling missing data in epidemiologic studies. However, guidance is lacking on best practices for using MI when estimating interventional mediation effects, specifically regarding the role of the missingness mechanism in the method's performance, how to appropriately specify the MI model when g-computation is used for effect estimation, and suitable approaches to variance estimation. To address this gap, we conducted simulations based on the Victorian Adolescent Health Cohort Study. We considered seven missingness mechanisms involving varying assumptions about the influence of an intermediate confounder, a mediator, and/or the outcome on missingness in key variables. We compared the performance of complete-case analysis, six MI approaches using fully conditional specification (differing in how the imputation model was tailored), and a "substantive model compatible" multiple imputation-fully conditional specification approach. We evaluated MIBoot (MI, then bootstrap) and BootMI (bootstrap, then MI) approaches for variance estimation. All MI approaches, apart from those clearly diverging from best practice, yielded approximately unbiased estimates when none of the intermediate confounder, mediator, and outcome variables influenced missingness in any of these variables, and showed non-negligible bias otherwise. We observed the largest bias for interventional effects when each of the intermediate confounders, mediators, and outcomes influenced their own missingness. BootMI returned variance estimates with smaller bias than MIBoot.
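The difference between the two variance-estimation orderings named above can be illustrated with a minimal sketch. This is a toy under stated assumptions and not the study's implementation: the estimand is a simple mean rather than an interventional effect from g-computation, imputation is a single stochastic regression rather than fully conditional specification, and the data are simulated with missingness completely at random. The function names `impute_once`, `boot_mi`, and `mi_boot` are hypothetical. For MIBoot, the sketch assumes one common variant (bootstrap variance within each imputed dataset, combined by Rubin's rules), which the abstract does not specify.

```python
import numpy as np

def impute_once(x, y, rng):
    """Toy stochastic regression imputation of missing y given fully observed x.
    Assumption: stands in for a real MI procedure; it ignores parameter uncertainty."""
    obs = ~np.isnan(y)
    slope, intercept = np.polyfit(x[obs], y[obs], 1)  # highest degree first
    resid_sd = np.std(y[obs] - (intercept + slope * x[obs]), ddof=2)
    y_imp = y.copy()
    n_mis = int((~obs).sum())
    y_imp[~obs] = intercept + slope * x[~obs] + rng.normal(0.0, resid_sd, n_mis)
    return y_imp

def boot_mi(x, y, n_boot, n_imp, rng):
    """BootMI: bootstrap the incomplete data first, then impute each resample."""
    n = len(y)
    boot_estimates = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, n)          # resample rows, missing values included
        xb, yb = x[idx], y[idx]
        # Average the toy estimate (mean of y) over the imputations of this resample
        boot_estimates[b] = np.mean(
            [impute_once(xb, yb, rng).mean() for _ in range(n_imp)]
        )
    # Variance of the estimator = variance across bootstrap resamples
    return boot_estimates.var(ddof=1)

def mi_boot(x, y, n_boot, n_imp, rng):
    """MIBoot: impute first, then bootstrap within each completed dataset.
    Assumption: within-imputation bootstrap variances are pooled by Rubin's rules."""
    n = len(y)
    point = np.empty(n_imp)
    within = np.empty(n_imp)
    for m in range(n_imp):
        y_imp = impute_once(x, y, rng)
        point[m] = y_imp.mean()
        boots = [y_imp[rng.integers(0, n, n)].mean() for _ in range(n_boot)]
        within[m] = np.var(boots, ddof=1)
    between = point.var(ddof=1)
    # Rubin's rules: total = within + (1 + 1/M) * between
    return within.mean() + (1.0 + 1.0 / n_imp) * between

# Simulated toy data (assumption: not related to the cohort study)
rng = np.random.default_rng(0)
n = 200
x = rng.normal(size=n)
y = 1.0 + 0.5 * x + rng.normal(size=n)
y[rng.random(n) < 0.3] = np.nan              # about 30% missing completely at random

var_bootmi = boot_mi(x, y, n_boot=50, n_imp=2, rng=rng)
var_miboot = mi_boot(x, y, n_boot=50, n_imp=5, rng=rng)
print("BootMI variance estimate:", var_bootmi)
print("MIBoot variance estimate:", var_miboot)
```

The only structural difference is which loop is outermost: BootMI re-runs the imputation inside every bootstrap resample, so the imputation step is part of what gets resampled, whereas MIBoot resamples data that have already been completed. The toy numbers say nothing about which approach performs better in the setting studied here.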