Variable Selection with the Knockoffs: Composite Null Hypotheses (2203.02849v4)

Published 6 Mar 2022 in math.ST, eess.SP, stat.ME, stat.ML, and stat.TH

Abstract: The fixed-X knockoff filter is a flexible framework for variable selection with false discovery rate (FDR) control in linear models with arbitrary design matrices (of full column rank), and it allows for finite-sample selective inference via the Lasso estimates. In this paper, we extend the theory of the knockoff procedure to tests with composite null hypotheses, which are usually more relevant to real-world problems. The main technical challenge lies in handling composite nulls in tandem with dependent features from arbitrary designs. We develop two methods for composite inference with the knockoffs, namely, shifted ordinary least-squares (S-OLS) and feature-response product perturbation (FRPP), building on new structural properties of test statistics under composite nulls. We also propose two heuristic variants of the S-OLS method that outperform the celebrated Benjamini-Hochberg (BH) procedure for composite nulls, which serves as a heuristic baseline under dependent test statistics. Finally, we analyze the loss in FDR when the original knockoff procedure is naively applied to composite tests.
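
As background for the two procedures named in the abstract, the sketch below shows the standard knockoff/knockoff+ selection threshold of Barber and Candès and the Benjamini-Hochberg step-up rule used as the baseline. It is a minimal sketch that assumes the feature statistics `W` and the p-values `pvals` are already computed by the user; the function names are illustrative, and nothing here implements the paper's S-OLS or FRPP constructions for composite nulls.

```python
import numpy as np

def knockoff_threshold(W, q=0.1, offset=1):
    """Data-dependent threshold of the (fixed-X) knockoff filter.

    W      : antisymmetric feature statistics, one per variable
             (large positive values suggest a non-null variable).
    q      : target false discovery rate.
    offset : 1 gives the knockoff+ filter (exact FDR control),
             0 gives the original knockoff filter (modified FDR).
    """
    W = np.asarray(W, dtype=float)
    # Candidate thresholds are the nonzero magnitudes of the statistics.
    for t in np.sort(np.abs(W[W != 0])):
        # Estimated FDP at t: (offset + #{W_j <= -t}) / max(1, #{W_j >= t}).
        fdp_hat = (offset + np.sum(W <= -t)) / max(1, np.sum(W >= t))
        if fdp_hat <= q:
            return t
    return np.inf  # no threshold achieves the target: select nothing

def knockoff_select(W, q=0.1, offset=1):
    """Indices selected by the knockoff filter at level q."""
    t = knockoff_threshold(W, q, offset)
    return np.where(np.asarray(W) >= t)[0]

def bh_select(pvals, q=0.1):
    """Benjamini-Hochberg step-up procedure (the baseline in the abstract)."""
    p = np.asarray(pvals, dtype=float)
    m = len(p)
    order = np.argsort(p)
    below = p[order] <= q * np.arange(1, m + 1) / m
    if not below.any():
        return np.array([], dtype=int)
    k = np.max(np.where(below)[0])  # largest k with p_(k) <= k*q/m
    return order[:k + 1]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Toy statistics: 10 signals with large positive W, 90 nulls near zero.
    W = np.concatenate([rng.normal(5, 1, 10), rng.normal(0, 1, 90)])
    print("knockoff+ selections:", knockoff_select(W, q=0.1))
```

The paper's methods modify how the statistics behave under composite nulls; the thresholding step shown here is the generic one from the original knockoff literature.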

References (25)
  1. Y. Benjamini, Y. Hochberg, Controlling the false discovery rate: a practical and powerful approach to multiple testing, Journal of the Royal Statistical Society: Series B (Methodological) 57 (1995) 289–300.
  2. R. F. Barber, E. J. Candès, Controlling the false discovery rate via knockoffs, The Annals of Statistics 43 (2015) 2055–2085.
  3. Y. Benjamini, D. Yekutieli, The control of the false discovery rate in multiple testing under dependency, The Annals of Statistics 29 (2001) 1165–1188.
  4. J. D. Storey, J. E. Taylor, D. Siegmund, Strong control, conservative point estimation and simultaneous conservative consistency of false discovery rates: a unified approach, Journal of the Royal Statistical Society: Series B (Statistical Methodology) 66 (2004) 187–205.
  5. B. Efron, R. Tibshirani, J. D. Storey, V. Tusher, Empirical Bayes analysis of a microarray experiment, Journal of the American Statistical Association 96 (2001) 1151–1160.
  6. R. F. Barber, E. J. Candès, A knockoff filter for high-dimensional selective inference, The Annals of Statistics 47 (2019) 2504–2537.
  7. E. Candès, Y. Fan, L. Janson, J. Lv, Panning for gold: ‘model-X’ knockoffs for high dimensional controlled variable selection, Journal of the Royal Statistical Society: Series B (Statistical Methodology) 80 (2018) 551–577.
  8. R. F. Barber, E. J. Candès, R. J. Samworth, Robust inference with knockoffs, The Annals of Statistics 48 (2020) 1409–1431.
  9. Y. Romano, M. Sesia, E. J. Candès, Deep knockoffs, Journal of the American Statistical Association (2019) 1–12.
  10. J. Jordon, J. Yoon, M. van der Schaar, KnockoffGAN: Generating knockoffs for feature selection using generative adversarial networks, in: International Conference on Learning Representations, 2018.
  11. Y. Lu, Y. Fan, J. Lv, W. S. Noble, DeepPINK: reproducible feature selection in deep neural networks, in: Advances in Neural Information Processing Systems, 2018, pp. 8676–8686.
  12. Y. Fan, J. Lv, M. Sharifvaghefi, Y. Uematsu, IPAD: stable interpretable forecasting with knockoffs inference, Journal of the American Statistical Association (2019) 1–13.
  13. M. Pournaderi, Y. Xiang, Differentially private variable selection via the knockoff filter, in: 2021 IEEE 31st International Workshop on Machine Learning for Signal Processing, IEEE, 2021, pp. 1–6.
  14. W. Sun, A. C. McLain, Multiple testing of composite null hypotheses in heteroscedastic models, Journal of the American Statistical Association 107 (2012) 673–687. doi:10.1080/01621459.2012.664505.
  15. T. Dickhaus, Randomized p-values for multiple testing of composite null hypotheses, Journal of Statistical Planning and Inference 143 (2013) 1968–1979. doi:10.1016/j.jspi.2013.06.011.
  16. S. Cabras, A note on multiple testing for composite null hypotheses, Journal of Statistical Planning and Inference 140 (2010) 659–666. doi:10.1016/j.jspi.2009.08.010.
  17. G. Blanchard, E. Roquain, Two simple sufficient conditions for FDR control, Electronic Journal of Statistics 2 (2008) 963–992.
  18. S. K. Sarkar, C. Y. Tang, Adjusting the Benjamini–Hochberg method for controlling the false discovery rate in knockoff-assisted variable selection, Biometrika 109 (2022) 1149–1155.
  19. W. Fithian, L. Lei, Conditional calibration for false discovery rate control under dependence, The Annals of Statistics 50 (2022) 3091–3118.
  20. Y. Luo, W. Fithian, L. Lei, Improving knockoffs with conditional calibration, arXiv preprint arXiv:2208.09542 (2022).
  21. A. Spector, L. Janson, Powerful knockoffs via minimizing reconstructability, The Annals of Statistics 50 (2022) 252–276.
  22. R. Tibshirani, Regression shrinkage and selection via the lasso, Journal of the Royal Statistical Society: Series B (Methodological) 58 (1996) 267–288.
  23. A. Ramdas, R. F. Barber, M. J. Wainwright, M. I. Jordan, A unified treatment of multiple testing with prior knowledge using the p-filter, The Annals of Statistics 47 (2019) 2790–2821.
  24. C. Dwork, F. McSherry, K. Nissim, A. Smith, Calibrating noise to sensitivity in private data analysis, in: Theory of Cryptography Conference, Springer, 2006, pp. 265–284.
  25. C. Dwork, A. Roth, The algorithmic foundations of differential privacy, Found. Trends Theor. Comput. Sci. 9 (2014) 211–407.