Bartlett's Anomaly in Bayesian Testing
- Bartlett's Anomaly is a phenomenon in Bayesian hypothesis testing where an infinitely diffuse prior under the alternative causes the Bayes factor to diverge, automatically favoring the null hypothesis.
- The anomaly arises because as the prior variance increases, the marginal likelihood under the alternative becomes arbitrarily small, rendering the point-null hypothesis dominant regardless of the data.
- This issue highlights the need for careful prior specification or the adoption of interval null hypotheses to avoid pathological inference in Bayesian testing.
Bartlett’s Anomaly refers to a phenomenon in Bayesian hypothesis testing under point nulls, wherein the Bayes factor for the null diverges to infinity as the variance of the conjugate prior on the alternative hypothesis becomes arbitrarily large. This guarantees acceptance of the null hypothesis regardless of the data, highlighting a critical incompatibility between diffuse priors and point-null testing. Although historically conflated with the Jeffreys–Lindley paradox, Bartlett’s Anomaly arises from prior-diffuseness, not sample-size asymptotics, and entails distinct implications and technical resolutions. Its identification has provoked reevaluation of both Bayesian test design and practical recommendations regarding prior specification (Lovric, 28 Nov 2025).
1. Historical Origin and Conceptual Differentiation
Bartlett’s Anomaly was first documented by Maurice S. Bartlett in 1957, in response to an omission in Lindley’s normal-normal Bayes factor for point-null testing. Christian Robert’s 1993 paper cemented the widespread confusion by characterizing the Jeffreys–Lindley paradox as “the fact that a point null hypothesis will always be accepted when the variance of a conjugate prior goes to infinity,” whereas Lovric later clarified that this is distinct from Lindley’s intended paradox.
The essential distinction is as follows:
| Phenomenon | Limiting Regime | Result |
|---|---|---|
| Bartlett’s Anomaly | $\tau^2 \to \infty$ (prior var.) | Null favored for any fixed data |
| Jeffreys–Lindley Paradox | $n \to \infty$ (sample size) | Null favored despite fixed Type I error (fixed $\alpha$) |
This split is not terminological: the analytic structure, practical consequences, and requisite solutions differ fundamentally (Lovric, 28 Nov 2025).
2. Formal Bayesian Model and Derivation
Consider a single observation $x$ from the Gaussian model $x \sim N(\theta, \sigma^2)$ with known variance $\sigma^2$. Testing proceeds between

$$H_0: \theta = \theta_0 \qquad \text{versus} \qquad H_1: \theta \neq \theta_0, \quad \theta \mid H_1 \sim N(\theta_0, \tau^2).$$

Assume equal prior odds, so that the Bayes factor equals the posterior odds. The marginal likelihoods are:
- Under $H_0$: $m_0(x) = N(x \mid \theta_0, \sigma^2)$
- Under $H_1$: $m_1(x) = N(x \mid \theta_0, \sigma^2 + \tau^2)$
- Thus, the Bayes factor is:

$$B_{01} = \frac{m_0(x)}{m_1(x)} = \sqrt{\frac{\sigma^2 + \tau^2}{\sigma^2}}\,\exp\!\left(-\frac{(x - \theta_0)^2}{2\sigma^2} + \frac{(x - \theta_0)^2}{2(\sigma^2 + \tau^2)}\right).$$

For $z = (x - \theta_0)/\sigma$,

$$B_{01} = \sqrt{1 + \frac{\tau^2}{\sigma^2}}\,\exp\!\left(-\frac{z^2}{2} \cdot \frac{\tau^2}{\sigma^2 + \tau^2}\right).$$
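As a sanity check on the algebra, the short Python sketch below compares this closed-form Bayes factor against the direct ratio of the two marginal normal densities (the values $\theta_0 = 0$, $\sigma = 1$, $\tau = 10$ are arbitrary illustration choices, not from the source):

```python
import math

def normal_pdf(x, mu, var):
    """Density of N(mu, var) evaluated at x."""
    return math.exp(-0.5 * (x - mu) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def bayes_factor_01(x, theta0=0.0, sigma=1.0, tau=10.0):
    """Closed-form B01 for the point-null normal test:
    sqrt(1 + tau^2/sigma^2) * exp(-(z^2/2) * tau^2/(sigma^2 + tau^2))."""
    z = (x - theta0) / sigma
    shrink = tau ** 2 / (sigma ** 2 + tau ** 2)
    return math.sqrt(1.0 + (tau / sigma) ** 2) * math.exp(-0.5 * z ** 2 * shrink)

# Check: the closed form equals the direct ratio of marginal densities
# m0(x) = N(x | theta0, sigma^2) and m1(x) = N(x | theta0, sigma^2 + tau^2).
x, theta0, sigma, tau = 1.5, 0.0, 1.0, 10.0
direct = normal_pdf(x, theta0, sigma ** 2) / normal_pdf(x, theta0, sigma ** 2 + tau ** 2)
assert abs(bayes_factor_01(x, theta0, sigma, tau) - direct) < 1e-12
```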
3. Mathematical Mechanism of Bartlett’s Anomaly
As the prior variance $\tau^2 \to \infty$ with $z$ and $\sigma^2$ fixed, the Bayes factor behaves as follows:

$$B_{01} \sim \frac{\tau}{\sigma}\, e^{-z^2/2} \longrightarrow \infty.$$

Consequently, for any observed $z$—regardless of extremity—a sufficiently diffuse prior under $H_1$ forces $B_{01} \to \infty$ and posterior probability $P(H_0 \mid x) \to 1$. This occurs independently of the data and produces an automatic endorsement of the null hypothesis.
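The mechanism can be seen numerically. In this sketch (parameter values are illustrative assumptions), even an observation at $z = 5$, which carries a two-sided p-value near $6 \times 10^{-7}$, is eventually outweighed by a sufficiently diffuse prior:

```python
import math

def bayes_factor_from_z(z, sigma=1.0, tau=1.0):
    """B01 = sqrt(1 + tau^2/sigma^2) * exp(-(z^2/2) * tau^2/(sigma^2 + tau^2))."""
    shrink = tau ** 2 / (sigma ** 2 + tau ** 2)
    return math.sqrt(1.0 + (tau / sigma) ** 2) * math.exp(-0.5 * z ** 2 * shrink)

# Even for a wildly "significant" observation (z = 5), diffusing the prior
# eventually drives B01 ~ (tau/sigma) * exp(-z^2/2) toward infinity.
for tau in (1.0, 1e2, 1e4, 1e6, 1e8):
    print(f"tau = {tau:9.0e}:  B01 = {bayes_factor_from_z(5.0, tau=tau):.4g}")
```

Note that $B_{01}$ exceeds 1 once $\tau/\sigma$ is of order $e^{z^2/2}$, so any fixed data can be "overruled" by a wide enough prior.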
4. Comparison to the Jeffreys–Lindley Paradox
In contrast, Lindley’s version fixes the prior variance $\tau^2$ and examines the asymptotic regime where sample size $n \to \infty$ while controlling the frequentist Type I error (fixed $\alpha$). For $z_n = \sqrt{n}(\bar{x} - \theta_0)/\sigma$ held at a "just significant" value $z_\alpha$,

$$B_{01} \approx \frac{\tau\sqrt{n}}{\sigma}\, e^{-z_\alpha^2/2} \longrightarrow \infty \quad \text{as } n \to \infty.$$
Both phenomena yield $B_{01} \to \infty$, but Bartlett’s Anomaly is driven by $\tau^2 \to \infty$ (prior-diffuseness), whereas the Jeffreys–Lindley paradox arises from $n \to \infty$ (sample-size asymptotics). The practical implication of Bartlett’s Anomaly is the pathological favoring of the point null due to an uninformative or infinitely diffuse prior, in contrast to the inherent conflict in the Jeffreys–Lindley scenario between fixed significance testing and point-mass priors in large samples (Lovric, 28 Nov 2025).
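The contrasting limit can be sketched numerically: here $\tau$ is held fixed while $n$ grows with the sample mean pinned at exact significance (parameter values are illustrative assumptions):

```python
import math

def bayes_factor_lindley(z_alpha, n, sigma=1.0, tau=1.0):
    """B01 for n observations when the sample mean is held 'just significant'
    at z_n = z_alpha; tau is FIXED and n grows (the Jeffreys-Lindley regime)."""
    r = n * tau ** 2 / sigma ** 2  # ratio of prior variance to sampling variance
    return math.sqrt(1.0 + r) * math.exp(-0.5 * z_alpha ** 2 * r / (1.0 + r))

# A result exactly significant at alpha = 0.05 (z = 1.96) is read as ever
# stronger evidence FOR the null as n grows.
for n in (10, 10_000, 10_000_000):
    print(f"n = {n:>10,}:  B01 = {bayes_factor_lindley(1.96, n):.4g}")
```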
5. Consequences for Prior Specification and Hypothesis Testing
Bartlett’s Anomaly demonstrates that in Bayesian point-null testing, improper or overly diffuse priors under $H_1$ render hypothesis comparison meaningless, as the Bayes factor can be manipulated to favor the null trivially:
- Posterior probability “hacking”: by making the prior increasingly diffuse, it is possible to force $P(H_0 \mid x) \to 1$, claiming maximal posterior support for $H_0$ regardless of the observed data.
- Overly diffuse or improper priors on the alternative effectively neglect the point mass assigned to the null, resulting in a degenerate test.
- Proposed fixes (e.g., Robert, 1993): calibrate prior odds relative to prior width, specifically letting the prior odds $\pi_0/\pi_1$ shrink in proportion to $\sigma/\tau$ to arrest the divergence. Alternative methods include intrinsic and fractional Bayes factors or nonlocal priors, which penalize diffuse alternatives to prevent spurious support for the null as $\tau^2 \to \infty$.
However, such corrections address only the prior-diffuseness anomaly and not the sample-size paradox intrinsic to Jeffreys–Lindley (Lovric, 28 Nov 2025).
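To illustrate the flavor of such a calibration, the sketch below scales the prior odds as $\sigma/\tau$; the exact proportionality constant, and attributing precisely this form to Robert (1993), are assumptions made here for illustration. The $\tau/\sigma$ growth of $B_{01}$ is cancelled, so the posterior odds converge to a finite, data-dependent limit:

```python
import math

def posterior_odds_calibrated(z, sigma=1.0, tau=1.0):
    """Posterior odds of H0 with prior odds set to pi0/pi1 = sigma/tau
    (illustrative calibration): any prior odds shrinking like sigma/tau
    cancels the tau/sigma growth of B01."""
    shrink = tau ** 2 / (sigma ** 2 + tau ** 2)
    b01 = math.sqrt(1.0 + (tau / sigma) ** 2) * math.exp(-0.5 * z ** 2 * shrink)
    return (sigma / tau) * b01

# As tau -> infinity the posterior odds now converge to exp(-z^2/2)
# instead of diverging, so the data retain their say.
for tau in (1e2, 1e4, 1e6):
    print(f"tau = {tau:.0e}:  posterior odds = {posterior_odds_calibrated(2.5, tau=tau):.6f}")
print(f"limit exp(-z^2/2) = {math.exp(-0.5 * 2.5 ** 2):.6f}")
```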
6. Resolution via Interval Nulls
The only comprehensive solution advanced is the adoption of interval, rather than point, null hypotheses. Replace $H_0: \theta = \theta_0$ with an interval or “region of practical equivalence” $H_0: |\theta - \theta_0| \le \varepsilon$ for some $\varepsilon > 0$, and use continuous priors on both $[\theta_0 - \varepsilon, \theta_0 + \varepsilon]$ and its complement. The Bayes factor generalizes to:

$$B_{01} = \frac{\int_{|\theta - \theta_0| \le \varepsilon} f(x \mid \theta)\,\pi_0(\theta)\,d\theta}{\int_{|\theta - \theta_0| > \varepsilon} f(x \mid \theta)\,\pi_1(\theta)\,d\theta}.$$
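This generalized Bayes factor can be computed by simple quadrature. In the sketch below the prior choices are illustrative assumptions, not prescribed by the source: uniform on the interval under $H_0$, and a $N(\theta_0, \tau^2)$ density truncated to the complement under $H_1$:

```python
import math

def normal_pdf(x, mu, var):
    return math.exp(-0.5 * (x - mu) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def midpoint(f, a, b, n=4000):
    """Midpoint-rule quadrature of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def interval_bayes_factor(x, theta0=0.0, eps=0.2, sigma=1.0, tau=2.0):
    """Interval-null Bayes factor B01 with assumed priors: uniform inside
    the ROPE, truncated N(theta0, tau^2) outside."""
    lo, hi = theta0 - eps, theta0 + eps
    left, right = theta0 - 10 * tau, theta0 + 10 * tau  # effective prior support

    # Numerator: likelihood averaged over the uniform prior on [lo, hi].
    num = midpoint(lambda t: normal_pdf(x, t, sigma ** 2) / (2 * eps), lo, hi)

    # Denominator: likelihood averaged over the truncated-normal prior.
    prior = lambda t: normal_pdf(t, theta0, tau ** 2)
    mass = midpoint(prior, left, lo) + midpoint(prior, hi, right)
    den = (midpoint(lambda t: normal_pdf(x, t, sigma ** 2) * prior(t), left, lo)
           + midpoint(lambda t: normal_pdf(x, t, sigma ** 2) * prior(t), hi, right)) / mass
    return num / den
```

With these priors, data near $\theta_0$ (e.g. $x = 0$) yield $B_{01} > 1$, while clearly discrepant data (e.g. $x = 5$) yield $B_{01} \ll 1$: the comparison is driven by the data rather than by prior width.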
In the “just significant” regime where $z_n = z_\alpha$ as $n \to \infty$, both Bayesian and frequentist procedures converge: equivalence is declared when $\bar{x}$ lies sufficiently far inside the interval $[\theta_0 - \varepsilon, \theta_0 + \varepsilon]$. Specifically,
- Frequentist TOST: declare equivalence if $|\bar{x} - \theta_0| \le \varepsilon - c\,\sigma/\sqrt{n}$ for an appropriate constant $c$.
- Bayesian: declare $H_0$ for $B_{01} \ge k$, for thresholds $k$ such as $3$ or $10$.
Both pathologies—Bartlett’s Anomaly and the Jeffreys–Lindley paradox—are eliminated. The two paradigms then address the same scientifically relevant question and give consistent inferences (Lovric, 28 Nov 2025).
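The TOST side of this convergence can be sketched directly; the choice $c = z_{1-\alpha}$ with $\alpha = 0.05$, and all parameter values, are illustrative assumptions:

```python
import math

Z_95 = 1.6448536269514722  # z_{1-alpha} for alpha = 0.05 (illustrative c)

def tost_declares_equivalence(xbar, n, theta0=0.0, eps=0.2, sigma=1.0, c=Z_95):
    """TOST decision: declare equivalence when the estimate sits at least
    c*sigma/sqrt(n) inside both edges, i.e. |xbar - theta0| <= eps - c*sigma/sqrt(n)."""
    return abs(xbar - theta0) <= eps - c * sigma / math.sqrt(n)

# In the 'just significant' regime, xbar - theta0 = z_alpha * sigma / sqrt(n)
# shrinks toward theta0, so for large enough n TOST declares equivalence,
# matching the interval-null Bayesian conclusion.
for n in (10, 100, 10_000):
    xbar = 1.96 / math.sqrt(n)
    print(f"n = {n:>6}: xbar = {xbar:.4f}, equivalence = {tost_declares_equivalence(xbar, n)}")
```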
7. Summary and Implications
Bartlett’s Anomaly is the divergence of the Bayes factor in favor of the point null hypothesis, provoked solely by taking the alternative’s prior variance $\tau^2 \to \infty$. First identified by Bartlett (1957) and later distinguished from the Jeffreys–Lindley paradox (Lovric, 28 Nov 2025), it exposes the inappropriateness of uninformative or improper priors under point-null testing. While methodological corrections exist, the only definitive resolution is to replace point-null hypotheses with meaningful interval nulls, ensuring statistical analyses are grounded in substantive scientific distinctions and yielding alignment between Bayesian and frequentist decisions.