Papers
Topics
Authors
Recent
Gemini 2.5 Flash
Gemini 2.5 Flash
166 tokens/sec
GPT-4o
7 tokens/sec
Gemini 2.5 Pro Pro
42 tokens/sec
o3 Pro
4 tokens/sec
GPT-4.1 Pro
38 tokens/sec
DeepSeek R1 via Azure Pro
28 tokens/sec
2000 character limit reached

Two-Stage Testing in a high dimensional setting (2406.17466v1)

Published 25 Jun 2024 in stat.ME, math.ST, and stat.TH

Abstract: In a high dimensional regression setting in which the number of variables ($p$) is much larger than the sample size ($n$), the number of possible two-way interactions between the variables is immense. If the number of variables is in the order of one million, which is usually the case in e.g., genetics, the number of two-way interactions is of the order one million squared. In the pursuit of detecting two-way interactions, testing all pairs for interactions one-by-one is computational unfeasible and the multiple testing correction will be severe. In this paper we describe a two-stage testing procedure consisting of a screening and an evaluation stage. It is proven that, under some assumptions, the tests-statistics in the two stages are asymptotically independent. As a result, multiplicity correction in the second stage is only needed for the number of statistical tests that are actually performed in that stage. This increases the power of the testing procedure. Also, since the testing procedure in the first stage is computational simple, the computational burden is lowered. Simulations have been performed for multiple settings and regression models (generalized linear models and Cox PH model) to study the performance of the two-stage testing procedure. The results show type I error control and an increase in power compared to the procedure in which the pairs are tested one-by-one.

Summary

We haven't generated a summary for this paper yet.

X Twitter Logo Streamline Icon: https://streamlinehq.com