Hypothesis testing with e-values

Published 31 Oct 2024 in math.ST, stat.ME, and stat.TH | (2410.23614v4)

Abstract: This book is written to offer a humble, but unified, treatment of e-values in hypothesis testing. It is organized into three parts: Fundamental Concepts, Core Ideas, and Advanced Topics. The first part includes four chapters that introduce the basic concepts. The second part includes five chapters of core ideas such as universal inference, log-optimality, e-processes, operations on e-values, and e-values in multiple testing. The third part contains seven chapters of advanced topics. The book collates important results from a variety of modern papers on e-values and related concepts, and also contains many results not published elsewhere. It offers a coherent and comprehensive picture on a fast-growing research area, and is ready to use as the basis of a graduate course in statistics and related fields.

Summary

  • The book presents e-values as an alternative to p-values: an e-value measures evidence against the null hypothesis, and its expected value under the null is at most one.
  • It describes calibrators that convert between e-values and p-values, and shows that the likelihood ratio is the log-optimal e-variable when a simple null is tested against a simple alternative.
  • It covers e-processes for sequential inference and the e-BH procedure for multiple testing, which controls the false discovery rate under arbitrary dependence.

Hypothesis Testing with E-values: An Overview

The book "Hypothesis Testing with E-values" by Aaditya Ramdas and Ruodu Wang gives a comprehensive treatment of e-values, an alternative to traditional p-values in hypothesis testing. This overview covers its key contributions and their implications for both the theory and the practice of statistical hypothesis testing.

E-Values: Concept and Significance

E-values are presented as a robust alternative to the p-values that are central to traditional hypothesis testing. The book defines an e-variable as a nonnegative random variable whose expected value under the null hypothesis is at most one; an e-value is a realized value of an e-variable. This gives a direct reading of evidence: the larger the e-value, the stronger the evidence against the null. By Markov's inequality, an e-variable reaches 1/α with probability at most α under the null, so rejecting when the e-value is at least 1/α gives a level-α test. E-values are also easy to combine: the average of e-values is an e-value under arbitrary dependence, and the product of independent e-values is an e-value. These merging properties account for much of their robustness in multiple testing and their usefulness across statistical domains.
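
As an illustration of the definition above, here is a minimal Python sketch. It assumes unit-variance Gaussian data with null mean 0 and a fixed alternative mean `mu_alt`; the function names are hypothetical and do not come from the book.

```python
import math

def gaussian_lr_evalue(xs, mu_alt):
    """Likelihood ratio of N(mu_alt, 1) against N(0, 1) for the sample xs.

    Under the null N(0, 1) this ratio has expectation exactly one,
    so it is an e-variable.
    """
    log_lr = sum(mu_alt * x - 0.5 * mu_alt ** 2 for x in xs)
    return math.exp(log_lr)

def reject(e_value, alpha):
    """Reject when E >= 1/alpha; Markov's inequality bounds the
    null probability of this event by alpha."""
    return e_value >= 1.0 / alpha

def average_evalues(e_values):
    """The arithmetic mean of e-values is an e-value, whatever the
    dependence between them."""
    return sum(e_values) / len(e_values)

if __name__ == "__main__":
    sample = [1.2, 0.8, 1.9, 1.4, 2.1]   # made-up observations
    e = gaussian_lr_evalue(sample, mu_alt=1.0)
    print(e, reject(e, alpha=0.05))
```

The threshold 1/α is conservative compared with a p-value cutoff, which is the price paid for the freedom to average or multiply e-values.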

Key Theoretical Contributions

A central theme of the book is the conversion between e-values and p-values by means of calibrators. The authors give necessary and sufficient conditions for a calibrator to be admissible and explore the implications of these conversions.
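
The two directions of calibration can be sketched as follows. The e-to-p map min(1, 1/e) and the family κ·p^(κ−1) are standard examples from the e-value literature; the function names and the default `kappa=0.5` are illustrative assumptions, not notation taken from the book.

```python
import math

def e_to_p(e):
    """Convert an e-value to a p-value via p = min(1, 1/e)."""
    if e <= 0.0:
        return 1.0
    return min(1.0, 1.0 / e)

def p_to_e(p, kappa=0.5):
    """Convert a p-value to an e-value via f(p) = kappa * p**(kappa - 1).

    For kappa in (0, 1) this function is decreasing in p and integrates
    to one over [0, 1], which is what makes the output an e-value.
    """
    if not 0.0 < kappa < 1.0:
        raise ValueError("kappa must lie strictly between 0 and 1")
    if p <= 0.0:
        return math.inf
    return kappa * p ** (kappa - 1.0)

if __name__ == "__main__":
    e = p_to_e(0.01)          # p-value -> e-value
    print(e, e_to_p(e))       # and back to a (larger) p-value
```

The round trip is lossy: calibrating a p-value to an e-value and back gives a larger p-value than the one you started with, so conversions are best avoided when a direct construction is available.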

The book also develops the relationship between e-values and likelihood ratios, in particular when both hypotheses are simple. In that setting the authors show that the likelihood ratio is a log-optimal e-variable: it attains the maximum e-power, where e-power is defined as the expected logarithm of the e-variable under the alternative. This connects e-values to classical hypothesis testing while retaining their interpretative advantages.
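
A small numerical check of log-optimality, assuming a single Bernoulli observation with null probability 0.5 and true alternative probability 0.7 (numbers chosen only for illustration; the helper names are hypothetical). Expectations are computed exactly by summing over the two outcomes.

```python
import math

def lr_evariable(x, p_null, q):
    """Likelihood ratio of Bernoulli(q) against Bernoulli(p_null) at x in {0, 1}."""
    numerator = q if x == 1 else 1.0 - q
    denominator = p_null if x == 1 else 1.0 - p_null
    return numerator / denominator

def expectation(f, p):
    """Exact E[f(X)] for X ~ Bernoulli(p)."""
    return p * f(1) + (1.0 - p) * f(0)

def e_power(bet_q, p_null, true_q):
    """Expected log of the likelihood ratio built with bet_q,
    evaluated under the true alternative Bernoulli(true_q)."""
    return expectation(lambda x: math.log(lr_evariable(x, p_null, bet_q)), true_q)

if __name__ == "__main__":
    p_null, true_q = 0.5, 0.7
    # Every bet_q gives a valid e-variable: its null expectation is one.
    print(expectation(lambda x: lr_evariable(x, p_null, 0.9), p_null))
    # E-power is largest when bet_q matches the true alternative.
    for bet_q in (0.6, 0.7, 0.9):
        print(bet_q, e_power(bet_q, p_null, true_q))
```

When `bet_q` equals `true_q`, the e-power is the Kullback-Leibler divergence from the alternative to the null. Over-aggressive choices such as `bet_q=0.9` can even yield negative e-power.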

Universal Inference and Composite Nulls

For composite null hypotheses, the authors present universal inference, a way of constructing e-values that is valid in finite samples and does not rely on regularity conditions. The book covers split and subsampled likelihood ratio approaches, which give a practical recipe for constructing e-values even for irregular statistical models or high-dimensional data. These developments extend e-values to real-world analyses where traditional asymptotic approximations fail or become unreliable.
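
The split likelihood ratio idea can be sketched as follows, under assumptions chosen for illustration: i.i.d. N(μ, 1) data, the composite null H0: μ ≤ 0, and a split into two halves by index. The function names are hypothetical.

```python
import math

def gaussian_loglik(xs, mu):
    """Log-likelihood of N(mu, 1) on xs, omitting the additive constant
    that cancels in likelihood ratios."""
    return -0.5 * sum((x - mu) ** 2 for x in xs)

def split_lr_evalue(xs):
    """Split likelihood ratio e-value for H0: mu <= 0.

    The second half of the data picks an alternative; the first half
    evaluates it against the best-fitting null parameter.
    """
    if len(xs) < 2:
        raise ValueError("need at least two observations to split")
    half = len(xs) // 2
    d0, d1 = xs[:half], xs[half:]
    mu_alt = sum(d1) / len(d1)              # any estimator fitted on d1 only
    mu_null = min(0.0, sum(d0) / len(d0))   # null-constrained MLE on d0
    return math.exp(gaussian_loglik(d0, mu_alt) - gaussian_loglik(d0, mu_null))

if __name__ == "__main__":
    data = [0.9, 1.4, 2.2, 1.1, 1.7, 0.6, 1.3, 2.0]  # made-up observations
    print(split_lr_evalue(data))
```

Validity rests on `mu_alt` being computed without looking at `d0`; the estimator itself can be arbitrary. Splitting by index is only appropriate when the order of the observations carries no information.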

Sequential Inference and E-Process Constructs

The extension of e-values to the sequential setting, in the form of e-processes, is another significant contribution of this work. E-processes yield inference procedures that remain valid at any stopping time, which makes them well suited to settings such as clinical trials, where data are collected sequentially. The book builds on the idea of testing by betting, which places e-processes at the intersection of probability, statistics, and information theory.

Ville's inequality underpins these guarantees: under the null, the probability that a nonnegative supermartingale with initial value at most one ever reaches 1/α is at most α. This is what allows e-processes to keep type I error control while permitting flexible, data-dependent stopping.
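
Testing by betting can be illustrated with a coin-flip example. The sketch below assumes binary outcomes, the null hypothesis of a fair coin, and a fixed betting fraction `lam`; these choices and the function names are assumptions made for illustration.

```python
def betting_wealth(xs, lam=0.5):
    """Wealth path of a gambler who stakes a fraction lam on heads (x = 1).

    Under H0 (fair coin) each factor 1 + lam * (2x - 1) has mean one,
    so the wealth is a nonnegative martingale starting at one.
    """
    if not -1.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [-1, 1] to keep wealth nonnegative")
    wealth = 1.0
    path = []
    for x in xs:
        wealth *= 1.0 + lam * (2 * x - 1)
        path.append(wealth)
    return path

def first_rejection_time(path, alpha):
    """First (1-based) time the wealth reaches 1/alpha, or None.

    By Ville's inequality, under H0 this time is finite with
    probability at most alpha.
    """
    for t, wealth in enumerate(path, start=1):
        if wealth >= 1.0 / alpha:
            return t
    return None

if __name__ == "__main__":
    flips = [1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1]  # made-up outcomes
    path = betting_wealth(flips, lam=0.5)
    print(first_rejection_time(path, alpha=0.05))
```

Because the guarantee holds simultaneously over all times, the analyst may check the wealth after every observation and stop as soon as the threshold is crossed.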

Multiple Testing and E-BH Procedure

The development and validation of the e-BH procedure for multiple hypothesis testing, a counterpart to the Benjamini-Hochberg procedure for p-values, underscore the utility of e-values beyond single hypothesis contexts. The paper demonstrates that the e-BH procedure controls the false discovery rate (FDR) under arbitrary dependence, presenting a robust alternative that sidesteps the dependency assumptions often required by p-value-based procedures. Theoretical guarantees accompanying the e-BH procedure further solidify its place in the multiple testing landscape.
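
A minimal sketch of the e-BH rule as it is usually stated: with K e-values sorted in decreasing order, find the largest k such that k times the k-th largest e-value is at least K/α, and reject the hypotheses with the k largest e-values. The function name and the example numbers are illustrative.

```python
def e_bh(e_values, alpha):
    """Return the (sorted) indices of hypotheses rejected by e-BH at level alpha."""
    K = len(e_values)
    # Indices ordered from the largest e-value to the smallest.
    order = sorted(range(K), key=lambda i: e_values[i], reverse=True)
    k_star = 0
    for k, i in enumerate(order, start=1):
        # Equivalent to k * e_[k] / K >= 1 / alpha.
        if k * e_values[i] >= K / alpha:
            k_star = k
    return sorted(order[:k_star])

if __name__ == "__main__":
    e_values = [100.0, 1.0, 50.0, 0.5]   # made-up e-values
    print(e_bh(e_values, alpha=0.1))
```

Like BH, the rule is step-up: the loop keeps the largest qualifying k, so a hypothesis can be rejected even if its own e-value is below K/α, as long as enough other e-values are also large.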

Practical Implications and Future Prospects

E-values and the methods built on them, such as universal inference, compound e-values, and e-processes, could substantially change statistical practice, particularly in data-rich settings and complex analyses where traditional methods may not suffice. Their robustness to dependence structures, applicability to sequential settings, and interpretative transparency could reshape hypothesis testing in fields ranging from bioinformatics and machine learning to econometrics and cognitive science.

Conclusion

Ramdas and Wang's exposition on e-values in hypothesis testing presents a unified and novel statistical framework that promises both theoretical depth and practical versatility. The adoption of e-values and the associated methodologies offers a substantial enhancement to the tools available for statisticians and data scientists, empowering robust decision-making amidst uncertainty while paving the way for further exploration into non-trivial statistical structures and phenomena.
