Estimating Fold Changes from Partially Observed Outcomes with Applications in Microbial Metagenomics (2402.05231v2)
Abstract: We consider the problem of estimating fold-changes in the expected value of a multivariate outcome observed with unknown sample-specific and category-specific perturbations. This challenge arises in high-throughput sequencing studies of the abundance of microbial taxa because microbes are systematically over- and under-detected relative to their true abundances. Our model admits a partially identifiable estimand, and we establish full identifiability by imposing interpretable parameter constraints. To reduce bias and guarantee the existence of estimators in the presence of sparse observations, we apply an asymptotically negligible and constraint-invariant penalty to our estimating function. We develop a fast coordinate descent algorithm for estimation, and an augmented Lagrangian algorithm for estimation under null hypotheses. We construct a model-robust score test and demonstrate valid inference even for small sample sizes and violated distributional assumptions. The flexibility of the approach and comparisons to related methods are illustrated through a meta-analysis of microbial associations with colorectal cancer.
Paper Prompts
Sign up for free to create and run prompts on this paper using GPT-5.
Top Community Prompts
Collections
Sign up for free to add this paper to one or more collections.