
The Sample Complexity of Simple Binary Hypothesis Testing (2403.16981v2)

Published 25 Mar 2024 in math.ST, cs.IT, math.IT, stat.ML, and stat.TH

Abstract: The sample complexity of simple binary hypothesis testing is the smallest number of i.i.d.\ samples required to distinguish between two distributions $p$ and $q$ in either: (i) the prior-free setting, with type-I error at most $\alpha$ and type-II error at most $\beta$; or (ii) the Bayesian setting, with Bayes error at most $\delta$ and prior distribution $(\pi, 1-\pi)$. This problem has only been studied when $\alpha = \beta$ (prior-free) or $\pi = 1/2$ (Bayesian), and the sample complexity is known to be characterized by the Hellinger divergence between $p$ and $q$, up to multiplicative constants. In this paper, we derive a formula that characterizes the sample complexity (up to multiplicative constants that are independent of $p$, $q$, and all error parameters) for: (i) all $0 \le \alpha, \beta \le 1/8$ in the prior-free setting; and (ii) all $\delta \le \pi/4$ in the Bayesian setting. In particular, the formula admits equivalent expressions in terms of certain divergences from the Jensen--Shannon and Hellinger families. The main technical result concerns an $f$-divergence inequality between members of the Jensen--Shannon and Hellinger families, which is proved by a combination of information-theoretic tools and case-by-case analyses. We explore applications of our results to (i) robust hypothesis testing, (ii) distributed (locally-private and communication-constrained) hypothesis testing, (iii) sequential hypothesis testing, and (iv) hypothesis testing with erasures.


Summary

  • The paper introduces a unified sample complexity formula that quantifies the minimum number of samples needed for binary hypothesis testing in both the Bayesian and prior-free settings.
  • It details a methodology that partitions the analysis into linear, sublinear, and polynomial error-probability regimes, employing quantities such as mutual information and f-divergences.
  • The analysis extends to distributed and robust testing, presenting efficient algorithms under communication constraints and exploring the weak detection regime.

Sample Complexity of Simple Binary Hypothesis Testing

Introduction

Statistical hypothesis testing, a cornerstone of statistical inference, concerns deciding between two competing hypotheses based on observed data. Its simplest and most fundamental instance is simple binary hypothesis testing: distinguishing between two specific distributions, $p$ and $q$, from a set of observations. While the classical Neyman-Pearson lemma provides the optimal test for this task, understanding the non-asymptotic sample complexity, i.e., the minimum number of samples required to reach a given error rate, remains challenging.
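The Neyman-Pearson procedure mentioned above reduces to thresholding a log-likelihood ratio. The following minimal sketch (not code from the paper) illustrates it for two discrete distributions represented as probability dictionaries:

```python
import math

def likelihood_ratio_test(samples, p, q, threshold=1.0):
    """Neyman-Pearson likelihood-ratio test for discrete p vs q.

    Returns True if the samples favour hypothesis p, i.e. the
    log-likelihood ratio is at least log(threshold).
    p and q are dicts mapping outcomes to probabilities.
    """
    log_lr = sum(math.log(p[x]) - math.log(q[x]) for x in samples)
    return log_lr >= math.log(threshold)

# Two biased coins: under p heads has probability 0.6, under q only 0.4.
p = {"H": 0.6, "T": 0.4}
q = {"H": 0.4, "T": 0.6}
# Three heads against one tail: log-LR = 2*ln(1.5) > 0, so the test accepts p.
print(likelihood_ratio_test(["H", "H", "T", "H"], p, q))  # → True
```

Varying `threshold` trades type-I error against type-II error, which is exactly the trade-off the sample complexity results below quantify.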

Sample Complexity in Bayesian and Prior-free Settings

For a rigorous exploration, we consider both the Bayesian and prior-free settings under the assumption that the Hellinger divergence between $p$ and $q$ is no greater than $0.125$. We derive a comprehensive formula characterizing the sample complexity up to multiplicative constants for:

  1. All $0 \le \alpha, \beta \le 1/8$ in the prior-free setting.
  2. All $\delta \le \alpha/4$ in the Bayesian setting.

Surprisingly, this formula admits equivalent expressions in terms of certain divergences from both the Jensen-Shannon and Hellinger families.
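As a concrete illustration of the divergences involved, this sketch computes the squared Hellinger distance and the Jensen-Shannon divergence for two discrete distributions, together with the classical heuristic that roughly $1/H^2$ samples suffice at constant error (the paper's formula refines this; the example distributions here are arbitrary):

```python
import math

def hellinger_sq(p, q):
    """Squared Hellinger distance H^2(p, q) between two discrete distributions."""
    return 0.5 * sum((math.sqrt(pi) - math.sqrt(qi)) ** 2 for pi, qi in zip(p, q))

def kl(p, q):
    """Kullback-Leibler divergence KL(p || q), in nats."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def jensen_shannon(p, q):
    """Jensen-Shannon divergence: symmetrised KL to the midpoint mixture."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

p = [0.6, 0.4]
q = [0.4, 0.6]
h2 = hellinger_sq(p, q)
print(f"H^2 = {h2:.4f}, JS = {jensen_shannon(p, q):.4f}")
print(f"samples ~ 1/H^2 = {1 / h2:.0f}")  # classical Theta(1/H^2) scaling at constant error
```

For these two close distributions both divergences come out near $0.02$, so on the order of fifty samples are needed even at constant error probability.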

Results in Bayesian Hypothesis Testing

The Bayesian sample complexity, denoted $n_B(p, q, \alpha, \delta)$, is the smallest number of i.i.d. samples needed to distinguish between $p$ and $q$ with Bayes error at most $\delta$ under the prior distribution $(\alpha, 1-\alpha)$. Our analysis segments the sample complexity question into three main regimes based on the ratio of $\delta$ to $\alpha$:

  • For a linear error probability, where $\delta$ is a small but constant fraction of $\alpha$, the sample complexity is inversely proportional to both the mutual information $I(\Theta; X_1)$ and an $f$-divergence $H_\lambda(p, q)$.
  • When $\delta$ is sublinear in $\alpha$, a reduction-based perspective shows that solving $B_B(p, q, \alpha, \delta)$ equates to solving multiple instances of $B_B(p, q, \alpha', \delta')$ with modified error probabilities, employing a median-based boosting of the outcome.
  • The third regime addresses the polynomial error probability setting where $\delta \leq \alpha^2$, thereby extending known results on asymptotic sample complexity.
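The median-based boosting in the sublinear regime follows the standard amplification pattern: repeat a constant-error tester on fresh samples and take a majority vote, driving the failure probability down exponentially in the number of repetitions. This generic sketch (not the paper's construction; the base tester here is a stand-in) illustrates the idea:

```python
import random

def amplify(base_test, runs):
    """Boost a base tester's confidence by majority vote over independent runs.

    If base_test() is correct with probability at least 2/3, the majority
    vote over `runs` independent repetitions errs with probability
    exponentially small in `runs` (by a Chernoff bound).
    """
    votes = sum(base_test() for _ in range(runs))
    return votes > runs / 2

# Hypothetical base tester: returns the correct answer (True) w.p. 2/3.
rng = random.Random(0)
base = lambda: rng.random() < 2 / 3
print(amplify(base, 101))  # majority vote is correct with overwhelming probability
```

Each repetition consumes its own batch of samples, which is why amplification multiplies the constant-error sample complexity by roughly $\log(1/\delta)$.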

Distributed and Robust Hypothesis Testing

Leveraging the derived sample complexity formula, we also investigate hypothesis testing under communication constraints and local differential privacy. We present algorithms exhibiting statistical and computational efficiency, demonstrating the profound impact of our foundational sample complexity analysis on distributed statistical inference paradigms.
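The paper's locally-private testers are not reproduced here; as a generic illustration of the constraint they operate under, the following sketch uses standard randomized response, where each sample is reduced to one privatized bit and the aggregator debiases the average (the epsilon value, sample sizes, and function names are illustrative):

```python
import math
import random

def randomized_response(bit, epsilon, rng):
    """epsilon-locally-private release of one bit: keep it w.p. e^eps/(1+e^eps)."""
    keep = math.exp(epsilon) / (1 + math.exp(epsilon))
    return bit if rng.random() < keep else 1 - bit

def debiased_mean(private_bits, epsilon):
    """Unbiased estimate of the true fraction of ones from privatized bits."""
    keep = math.exp(epsilon) / (1 + math.exp(epsilon))
    raw = sum(private_bits) / len(private_bits)
    return (raw - (1 - keep)) / (2 * keep - 1)

rng = random.Random(1)
true_bits = [1] * 700 + [0] * 300  # true fraction of ones: 0.7
noisy = [randomized_response(b, 1.0, rng) for b in true_bits]
print(f"debiased estimate ~ {debiased_mean(noisy, 1.0):.2f}")
```

The privatization channel contracts the divergence between the two hypotheses, which is precisely why sample complexity under local privacy grows relative to the unconstrained setting.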

Weak Detection Regime

A closer look at the weak detection regime, where the error probabilities approach the priors, reveals that standard divergences such as the Hellinger divergence no longer fully characterize the sample complexity. This analysis yields the intriguing observation that the sample complexity can be insensitive to small perturbations in the error probabilities.

Conclusion

The meticulous characterization of sample complexity for simple binary hypothesis testing presents profound theoretical and practical implications, offering insights into the minimal sample requirements across various settings. Our findings unlock new pathways for future explorations in statistical hypothesis testing, encouraging in-depth examinations of weak detection regimes and further application of our formula in distributed hypothesis testing scenarios.