Experimental Unit Information Index (EUII)
- Experimental Unit Information Index (EUII) is a metric that quantifies evidentiary value per experimental unit by normalizing the diagnostic odds ratio in hypothesis testing.
- It applies to both fixed-sample and adaptive designs, offering interpretation from frequentist and Bayesian perspectives to optimize study parameters.
- Numerical and asymptotic analyses reveal that increasing power and employing early-stopping rules enhance EUII, leading to more efficient and ethically sound experiments.
The Experimental Unit Information Index (EUII) quantifies the evidentiary value contributed by a single experimental unit in the context of hypothesis testing. Designed to enable rigorous trade-offs between statistical power, Type I error, and sample size, the EUII provides a single, unit-normalized metric that characterizes the per-unit accumulation of evidentiary value in both fixed-sample and adaptive designs. It is interpretable from both frequentist and Bayesian perspectives and offers guidance for optimizing study designs, particularly in fields such as animal research where reduction in experimental units is ethically mandated (Held et al., 21 Nov 2025).
1. Definition in Fixed-Sample Designs
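The EUII for a fixed-sample design derives directly from the diagnostic odds ratio (DOR), which combines the likelihood ratios for significant and non-significant outcomes under both null ($H_0$) and alternative ($H_1$) hypotheses. For a test with per-sample Type I error $\alpha$ and power $1-\beta$, define:
- Positive likelihood ratio: $\mathrm{LR}^{+} = \dfrac{1-\beta}{\alpha}$
- Negative likelihood ratio: $\mathrm{LR}^{-} = \dfrac{\beta}{1-\alpha}$
The diagnostic odds ratio:
$$\mathrm{DOR} = \frac{\mathrm{LR}^{+}}{\mathrm{LR}^{-}} = \frac{(1-\beta)(1-\alpha)}{\alpha\,\beta}$$
If $n$ independent units are used, each unit is attributed with the $n$th root of the DOR:
$$\mathrm{EUII} = \mathrm{DOR}^{1/n}$$
A test is considered evidentially useful if $\mathrm{EUII} > 1$, which holds exactly when the power exceeds the Type I error rate ($1-\beta > \alpha$). This metric is agnostic to the specific values of $\alpha$, $\beta$, or $n$ and is readily computed using design specifications.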
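The fixed-sample definition can be sketched in a few lines of Python. This is a minimal illustration, assuming the standard forms $\mathrm{LR}^{+} = (1-\beta)/\alpha$ and $\mathrm{LR}^{-} = \beta/(1-\alpha)$; the function name `euii_fixed` and the example inputs ($\alpha = 0.05$, power $0.80$, $n = 50$) are illustrative choices, not values taken from the source.

```python
def euii_fixed(alpha: float, power: float, n: int) -> dict:
    """Likelihood ratios, DOR and EUII for a fixed-sample design.

    alpha: Type I error rate; power: 1 - beta; n: number of experimental units.
    """
    beta = 1.0 - power
    lr_pos = power / alpha           # LR+ = (1 - beta) / alpha
    lr_neg = beta / (1.0 - alpha)    # LR- = beta / (1 - alpha)
    dor = lr_pos / lr_neg            # diagnostic odds ratio
    euii = dor ** (1.0 / n)          # n-th root: per-unit contribution
    return {"lr_pos": lr_pos, "lr_neg": lr_neg, "dor": dor, "euii": euii}


# Illustrative inputs (assumed, not from the source).
result = euii_fixed(alpha=0.05, power=0.80, n=50)
print(result)
```

With these inputs the DOR is $0.8 \times 0.95 / (0.05 \times 0.2) = 76$ up to floating-point rounding, and the EUII is its 50th root.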
2. Asymptotic Properties
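The behavior of EUII as $n \to \infty$ elucidates its theoretical bounds. For a one-sided, one-sample $z$-test with effect size $\delta$:
$$1-\beta = \Phi\left(\delta\sqrt{n} - z_{1-\alpha}\right)$$
where $\Phi$ is the standard normal CDF and $z_{1-\alpha} = \Phi^{-1}(1-\alpha)$. Setting $\beta = \Phi\left(z_{1-\alpha} - \delta\sqrt{n}\right)$ yields:
$$\mathrm{EUII} = \left(\frac{(1-\alpha)(1-\beta)}{\alpha}\right)^{1/n} \beta^{-1/n}$$
Since $(1-\alpha)(1-\beta)/\alpha$ is asymptotically constant, EUII converges as:
$$\lim_{n\to\infty} \mathrm{EUII} = \lim_{n\to\infty} \beta^{-1/n} = \exp\left(\delta^2/2\right)$$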
For a two-sample $z$-test of mean difference $\delta$, the corresponding limit is:
This establishes that the per-unit informational gain exhibits asymptotic saturation, reflecting diminishing per-unit returns as sample size increases.
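The saturation can be checked numerically. The sketch below assumes a one-sided, one-sample $z$-test with unit variance and fixed $\alpha$, for which a short calculation from the definitions gives $\beta^{-1/n} \to \exp(\delta^2/2)$. The function name, $\delta = 0.5$, $\alpha = 0.05$ and the grid of $n$ are illustrative assumptions. The DOR is computed on the log scale and $\beta$ via `math.erfc` so that very small Type II error rates do not round to zero.

```python
import math
from statistics import NormalDist


def euii_one_sample_z(delta: float, n: int, alpha: float = 0.05) -> float:
    """EUII for a one-sided, one-sample z-test (unit variance assumed)."""
    z = NormalDist().inv_cdf(1.0 - alpha)         # critical value z_{1-alpha}
    x = delta * math.sqrt(n) - z
    beta = 0.5 * math.erfc(x / math.sqrt(2.0))    # beta = Phi(z - delta*sqrt(n))
    log_dor = (math.log1p(-beta) + math.log(1.0 - alpha)
               - math.log(alpha) - math.log(beta))
    return math.exp(log_dor / n)


delta = 0.5                                        # illustrative effect size
limit = math.exp(delta ** 2 / 2.0)                 # assumed asymptotic value
for n in (1000, 2000, 4000):
    value = euii_one_sample_z(delta, n)
    print(n, round(value, 4), "gap to limit:", round(limit - value, 4))
```

Convergence is slow, because the leading correction to $\log \mathrm{EUII}$ decays only like $1/\sqrt{n}$. At small $n$ the EUII can also lie above the limiting value, so the approach is not monotone over the whole range of $n$.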
3. Interpretations: Frequentist and Bayesian Perspectives
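Frequentist Interpretation: The DOR is the ratio of the odds of a significant result under $H_1$ to the odds under $H_0$, and EUII is its $n$th root, i.e. the geometric mean increase in these odds per unit:
$$\mathrm{EUII} = \left(\frac{(1-\beta)/\beta}{\alpha/(1-\alpha)}\right)^{1/n}$$
Raising the DOR to the power $1/n$ thus expresses EUII as the per-unit multiplicative increase in evidentiary odds.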
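Bayesian Interpretation: By Bayes’ theorem, observing a significant (or non-significant) result modifies the posterior odds for $H_1$:
- Significant: $\dfrac{\Pr(H_1 \mid \text{sig})}{\Pr(H_0 \mid \text{sig})} = \mathrm{LR}^{+} \cdot \dfrac{\Pr(H_1)}{\Pr(H_0)}$
- Non-significant: $\dfrac{\Pr(H_1 \mid \text{nonsig})}{\Pr(H_0 \mid \text{nonsig})} = \mathrm{LR}^{-} \cdot \dfrac{\Pr(H_1)}{\Pr(H_0)}$
The DOR quantifies the ratio of posterior odds between significant and non-significant outcomes; the prior odds cancel in this ratio. Therefore, EUII is the per-unit geometric average change in Bayes-factor-equivalent posterior odds distinguishing significant from non-significant results.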
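The Bayesian reading can be illustrated with a small sketch: posterior odds are the prior odds multiplied by the likelihood ratio of the observed outcome, so the ratio of the two posterior odds equals the DOR whatever the prior. The function name and the numerical inputs (prior odds $0.25$, $\alpha = 0.05$, power $0.80$) are assumptions for illustration only.

```python
def posterior_odds(prior_odds: float, alpha: float, power: float,
                   significant: bool) -> float:
    """Posterior odds of H1 versus H0 after a dichotomized test result."""
    if significant:
        likelihood_ratio = power / alpha                   # LR+
    else:
        likelihood_ratio = (1.0 - power) / (1.0 - alpha)   # LR-
    return prior_odds * likelihood_ratio


prior = 0.25                                   # illustrative prior odds for H1
odds_sig = posterior_odds(prior, 0.05, 0.80, significant=True)
odds_nonsig = posterior_odds(prior, 0.05, 0.80, significant=False)
odds_ratio = odds_sig / odds_nonsig            # equals the DOR; prior cancels
print(odds_sig, odds_nonsig, odds_ratio)
```

Changing `prior` rescales both posterior odds by the same factor and leaves `odds_ratio` unchanged, which is why the DOR, and hence EUII, does not depend on the prior.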
4. Extension to Adaptive and Group-Sequential Designs
In adaptive or group-sequential studies, the sample size is a random variable contingent on interim stopping for efficacy or futility, and its distribution differs under $H_0$ and $H_1$. Define:
- : Expected sample size when stopping with significance (“sig”)
- : Expected sample size when stopping with non-significance (“nonsig”)
The generalized EUII becomes:
Variability in can be accommodated by a second-order Taylor expansion:
where and similarly for .
Analytic or simulation-based estimation of these expected sample sizes enables concrete EUII calculation for varied adaptive designs, allowing precise evaluation of how early-stopping rules impact per-unit evidentiary value.
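A simulation-based estimate of the two expected sample sizes might look like the sketch below. It is not the source's procedure: it assumes a one-sided, one-sample $z$-test with unit variance, equally spaced looks, a constant (Pocock-style) efficacy bound and no futility stopping. The function name, the group size, and the bound $2.361$ (the commonly tabulated Pocock constant for four looks at two-sided $\alpha = 0.05$) are illustrative assumptions. It estimates only the conditional mean sample sizes and does not implement the generalized EUII formula itself.

```python
import math
import random


def simulate_expected_n(delta, group_size, looks, bound, n_sim=20000, seed=1):
    """Monte Carlo mean sample size, split by significant / non-significant end."""
    rng = random.Random(seed)
    n_sig, n_nonsig = [], []
    for _ in range(n_sim):
        total = 0.0                       # running sum of observations
        for k in range(1, looks + 1):
            # Sum of `group_size` N(delta, 1) draws is N(m*delta, m).
            total += rng.gauss(group_size * delta, math.sqrt(group_size))
            n = k * group_size
            if total / math.sqrt(n) >= bound:     # z-statistic crosses the bound
                n_sig.append(n)
                break
        else:
            n_nonsig.append(looks * group_size)   # never crossed: full sample

    def mean(xs):
        return sum(xs) / len(xs) if xs else float("nan")

    return {"p_sig": len(n_sig) / n_sim,
            "mean_n_sig": mean(n_sig),
            "mean_n_nonsig": mean(n_nonsig)}


# Illustrative run with the alternative true (delta = 0.5), four looks of 10 units.
print(simulate_expected_n(delta=0.5, group_size=10, looks=4, bound=2.361))
```

Because this sketch has no futility rule, every non-significant trial uses the full sample; adding a futility boundary would be needed to reduce the expected sample size for non-significant outcomes as well.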
5. Numerical Examples
A two-arm, fixed-sample design with , power $0.80$, and effect size :
- ,
Each unit increases DOR by approximately .
A constant-bound Pocock group-sequential design (four looks), with , , and :
- ,
Early stopping confers an additional per-unit evidentiary value of over the fixed-sample design.
6. Implications for Design Optimization
Maximizing EUII entails:
- Recognizing that for a fixed effect size $\delta$, the asymptotic bound $\exp(\delta^2/2)$ (or the corresponding bound for two-sample tests) is unattainable by further increasing $n$; per-unit information exhibits diminishing returns.
- Tuning $\alpha$ at finite $n$ trades off between $\mathrm{LR}^{+}$ and $\mathrm{LR}^{-}$. Standard choices for $\alpha$ are typically near-optimal.
- Lower $\beta$ (higher power) always increases DOR and EUII but necessitates larger $n$.
- Maximizing power for fixed $\alpha$ yields the uniformly most powerful test and, hence, the highest EUII.
- Adaptive designs, notably those with effective early-stopping rules for efficacy or futility, reduce the expected sample size under significance or non-significance, increasing EUII substantially. Two to four well-selected interim analyses and application of predictive-power futility boundaries (e.g., stop if predictive power –$0.3$) capture a substantial fraction of possible evidentiary gains.
- Unbalanced randomization reduces power at fixed $n$ and thus modestly lowers EUII; equal allocation optimizes EUII if this metric is prioritized.
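The allocation point in the list above can be illustrated with a small sketch. It assumes a one-sided, two-sample $z$-test with unit variance in both arms and a fixed total of $n$ units, so that the power is $\Phi\left(\delta\sqrt{n\,r(1-r)} - z_{1-\alpha}\right)$ for allocation fraction $r$. The function name and the inputs ($\delta = 0.5$, $n = 100$, $\alpha = 0.05$) are illustrative assumptions, not values from the source.

```python
import math
from statistics import NormalDist


def euii_two_sample(delta: float, n_total: int, frac_treat: float,
                    alpha: float = 0.05) -> float:
    """EUII for a one-sided two-sample z-test with allocation fraction frac_treat."""
    std_normal = NormalDist()
    z = std_normal.inv_cdf(1.0 - alpha)
    # Var(mean difference) = 1/(n*r) + 1/(n*(1-r)) = 1 / (n * r * (1 - r)).
    ncp = delta * math.sqrt(n_total * frac_treat * (1.0 - frac_treat))
    power = std_normal.cdf(ncp - z)
    dor = power * (1.0 - alpha) / (alpha * (1.0 - power))
    return dor ** (1.0 / n_total)


for frac in (0.5, 0.6, 0.7, 0.8, 0.9):
    print(frac, round(euii_two_sample(0.5, 100, frac), 4))
```

At fixed $\alpha$ and $n$ the DOR is increasing in power, and $r(1-r)$ is largest at $r = 0.5$, so under these assumptions equal allocation gives the highest EUII.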
In summary, the EUII provides a rigorous, unified measure of evidence efficiency per experimental unit in both frequentist and Bayesian contexts. It is maximized by adopting most powerful critical values at fixed $\alpha$, minimizing $n$ for given power, and employing adaptive early-stopping rules where feasible to reduce expected sample size while preserving evidentiary value (Held et al., 21 Nov 2025).