A generalized e-value feature detection method with FDR control at multiple resolutions

Published 25 Sep 2024 in stat.ME | (2409.17039v3)

Abstract: Multiple resolutions arise in a wide range of explanatory features owing to domain-specific structure, which organizes the features into groups. In this context, simultaneously detecting the features and groups that are significant for a given response, with false discovery rate (FDR) control, is a crucial problem, for example in spatial genome-wide association studies. Existing methods typically require the same detection approach at every resolution to achieve multilayer FDR control, which may be inefficient. For instance, the knockoff method is unsuitable for detecting highly correlated features, so the efficiency of the multilayer knockoff filter (MKF) is not guaranteed either. To tackle this problem, we introduce a novel method, the derandomized flexible e-filter procedure (DFEFP), by developing generalized e-values. The method can use a wide variety of base detection procedures that operate effectively across various resolutions to provide stable and consistent results, while controlling the FDR at multiple resolutions simultaneously. Furthermore, we investigate the statistical properties of the DFEFP, including multilayer FDR control, a stability guarantee, and the correctness of the algorithm's solution. As a first example, the DFEFP is used to construct an e-value data splitting filter (eDS-filter). The eDS-filter is then combined with the group knockoff filter (gKF) to develop a more flexible methodology, referred to as the eDS+gKF-filter. Simulation studies demonstrate that the eDS+gKF-filter effectively controls FDR at multiple resolutions while either maintaining or enhancing power compared to MKF. The superiority of the eDS+gKF-filter is also demonstrated through the analysis of HIV mutation data.
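
The abstract does not spell out the DFEFP itself, so the following is only a loosely related illustration of how e-values can drive FDR-controlled selection at a single resolution. It sketches the standard e-BH rule (reject the k* largest e-values, where k* is the largest k with k * e_(k) >= m / alpha) applied to e-values averaged over repeated runs, a common derandomization device. This is not the paper's multilayer procedure; the function names `average_e_values` and `ebh` and the toy numbers are assumptions made for illustration.

```python
def average_e_values(e_runs):
    """Average e-values feature-wise across repeated runs.

    Hypothetical helper: each inner list holds one run's e-values,
    one per feature. The average of e-values is again an e-value.
    """
    n_runs = len(e_runs)
    return [sum(col) / n_runs for col in zip(*e_runs)]


def ebh(e_values, alpha):
    """Standard e-BH rule at a single resolution (not the paper's DFEFP).

    Returns the sorted indices of the selected features.
    """
    m = len(e_values)
    # Feature indices ordered from largest to smallest e-value.
    order = sorted(range(m), key=lambda j: e_values[j], reverse=True)
    k_star = 0
    for k, j in enumerate(order, start=1):
        # Keep the largest k whose k-th largest e-value clears m / (alpha * k).
        if k * e_values[j] >= m / alpha:
            k_star = k
    return sorted(order[:k_star])


# Toy e-values (made up) for four features over two runs.
e_runs = [
    [40.0, 0.5, 30.0, 1.0],
    [60.0, 0.3, 10.0, 2.0],
]
avg_e = average_e_values(e_runs)
selected = ebh(avg_e, alpha=0.2)
print(selected)
```

Averaging before selection means a feature is chosen only if its evidence is strong on average across runs, which is one way to reduce run-to-run variability of the selected set.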