
Testing by Betting: A Statistical Framework

Updated 24 May 2026
  • Testing by Betting is a framework that reinterprets hypothesis testing as a sequential betting game, using an e-process to quantify evidence against a null hypothesis.
  • The method leverages martingale theory and game-theoretic probability to achieve anytime risk control, overcoming limitations of traditional p-value approaches.
  • Extensions address borrowing and mispriced odds by normalizing wealth processes, ensuring that even with leverage, statistical evidence remains valid.

Testing by Betting is a general statistical framework that interprets hypothesis testing as a dynamic betting game, transforming the evaluation of evidence into the language of financial wagers and wealth processes. The central object is an e-process (betting score), which quantifies evidence against a null hypothesis by tracking the cumulative returns of a sequential betting strategy. The core methodology exploits deep connections to martingale theory and game-theoretic probability, enabling uniformly valid, anytime risk control—properties unobtainable by classical p-value methodology. This paradigm has recently been extended to model the effects of leverage (“borrowing”) and “bargaining” for mispriced odds, which pose new trade-offs for the authenticity and interpretation of statistical evidence (Wang et al., 2024).

1. Foundations of Testing by Betting

At its core, testing by betting formalizes statistical evidence as the capital accumulated by a gambler betting sequentially against a null. At time $n$, the gambler has observed data $X_1, X_2, \dots, X_n$ adapted to a filtration $(\mathcal{F}_n)$. Under the null $H_0$, we require $\mathbb{E}_0[X_n \mid \mathcal{F}_{n-1}] = m_n$ for predictable $m_n$. A bet $b_n$ is $\mathcal{F}_{n-1}$-measurable; wagering $b_n$ on $X_n$ results in a payoff $b_n (X_n - m_n)$. The total “wealth” evolves as $W_n = W_{n-1} + b_n (X_n - m_n)$, starting from $W_0 = 1$. Provided $W_n$ remains nonnegative, this process is a (local) martingale under $H_0$.

The observed wealth $W_n$ is an e-value: large $W_n$ indicates strong evidence against $H_0$. Ville's inequality provides a nonasymptotic guarantee: for a nonnegative wealth process with $W_0 = 1$ and any $\alpha \in (0, 1)$,

$$\mathbb{P}_0\left(\exists n \ge 1 : W_n \ge 1/\alpha\right) \le \alpha,$$

so the rule “reject $H_0$ at the first $n$ for which $W_n \ge 1/\alpha$” controls Type I error at level $\alpha$ at all stopping times.
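The guarantee can be checked exactly in a toy case. The sketch below assumes a fair-coin null and a gambler who stakes a fixed fraction of current wealth on heads each round; the function name `crossing_probability` and the parameter choices are illustrative, not taken from the source. It computes, by dynamic programming over the number of heads, the exact null probability that wealth ever reaches $1/\alpha$ within a finite horizon.

```python
def crossing_probability(fraction, alpha, rounds):
    """Exact null probability that wealth reaches 1/alpha within `rounds` flips.

    Assumed toy setting: fair coin, wealth multiplies by (1 + fraction) on
    heads and by (1 - fraction) on tails, starting from wealth 1.
    """
    threshold = 1.0 / alpha
    alive = {0: 1.0}   # heads count -> probability of paths that have not crossed
    crossed = 0.0
    for n in range(1, rounds + 1):
        next_alive = {}
        for heads, prob in alive.items():
            for new_heads in (heads, heads + 1):
                tails = n - new_heads
                wealth = (1 + fraction) ** new_heads * (1 - fraction) ** tails
                if wealth >= threshold:
                    crossed += 0.5 * prob      # path stops: H0 is rejected
                else:
                    next_alive[new_heads] = next_alive.get(new_heads, 0.0) + 0.5 * prob
        alive = next_alive
    return crossed


alpha = 0.05
p_cross = crossing_probability(fraction=0.5, alpha=alpha, rounds=200)
print(f"P0(wealth ever >= {1 / alpha:.0f}) = {p_cross:.4f} (bound: {alpha})")
```

Because the wealth overshoots the threshold when it crosses, the exact probability sits strictly below $\alpha$ in this example; the bound is conservative rather than tight.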

An e-process is a nonnegative adapted process $(E_n)_{n \ge 0}$ with $\mathbb{E}_0[E_\tau] \le 1$ for all stopping times $\tau$. E-processes generalize p-values to allow “anytime” (optional stopping) guarantees and are typically constructed via suitable betting strategies (Shafer, 2019; Waudby-Smith et al., 2020).

2. Classical No-Borrowing and its Limitations

The canonical constraint in betting-based statistical tests is “no-borrowing”: at step $n$, the gambler is forbidden from staking more than their current wealth, i.e., $|b_n| \le W_{n-1}$, ensuring $W_n \ge 0$ always. This constraint guarantees that wealth processes are nonnegative martingales or supermartingales, foundational for the validity of tail probabilities and Type I error control via Ville’s inequality.

However, this restriction imposes strict sequential limitations: if $W_n$ ever hits zero, the test cannot proceed. In contrast, in quantitative finance, the use of leverage (betting with borrowed capital) is pervasive. This motivates the question: how do sequential betting-based tests behave once the no-borrowing constraint is relaxed (Wang et al., 2024)?

3. Extensions: Borrowing, Liabilities, and Net Wealth

The extension to allowing borrowing modifies the process dynamics:

  • Let $\ell_n \ge 0$ be the borrowed funds at round $n$ ($L_n = \sum_{i \le n} \ell_i$ is the cumulative liability).
  • The gambler’s gross wealth (before debts are repaid) is $W_n = W_{n-1} + \ell_n + b_n (X_n - m_n)$, with $m_n = 0$ for fair double-or-nothing odds on $X_n \in \{-1, +1\}$.
  • The net wealth is $N_n = W_n - L_n$.

Importantly, the net-wealth process $(N_n)$ is a martingale under $H_0$ and the gross wealth decomposes as $W_n = N_n + L_n$ (Doob decomposition).
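A small exact enumeration makes the decomposition concrete. The sketch below is an illustration under assumed rules that are not from the source: a fair coin, a gambler who stakes everything on heads at double-or-nothing odds, and a loan of exactly one unit whenever gross wealth is zero. The helper names `play` and `null_means` are hypothetical. Averaging over all equally likely flip sequences shows that gross wealth grows in expectation only by the amount borrowed, while net wealth keeps expectation 1.

```python
from itertools import product


def play(flips):
    """Return (gross, liabilities, net) after betting through `flips` (+1 or -1)."""
    gross, liabilities = 1.0, 0.0
    for x in flips:
        loan = 1.0 if gross == 0.0 else 0.0    # assumed rule: borrow one unit when broke
        liabilities += loan
        stake = gross + loan                    # all-in on heads (x = +1)
        gross = gross + loan + stake * x        # fair double-or-nothing payoff
    return gross, liabilities, gross - liabilities


def null_means(rounds):
    """Exact expectations under a fair coin, enumerating all flip sequences."""
    weight = 0.5 ** rounds
    totals = [0.0, 0.0, 0.0]
    for flips in product((+1, -1), repeat=rounds):
        for i, value in enumerate(play(flips)):
            totals[i] += weight * value
    return tuple(totals)


mean_gross, mean_liabilities, mean_net = null_means(8)
print(mean_gross, mean_liabilities, mean_net)
```

For eight rounds the expected gross wealth is 4.5 against an expected liability of 3.5, so reading gross wealth as evidence would overstate it by exactly the expected debt.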

Statistical guarantees are maintained in the borrowed setting under certain controls:

  • Bounded Liabilities: If the expected liability satisfies $\mathbb{E}_0[L_\tau] \le B$ at every stopping time $\tau$, then the scaled process $W_n / (1 + B)$ is an e-process, so the probability of excess evidence is inflated by a factor $1 + B$.
  • Lower-bounded Net Wealth: If $N_n \ge -c$ uniformly, then $(N_n + c) / (1 + c)$ is an e-process, controlling Type I error depending on the minimum allowed net wealth.
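The lower-bounded case follows in a few lines. The derivation below is a sketch in assumed notation: $N_n$ denotes net wealth, normalized so that $N_0 = 1$, and $c \ge 0$ is the floor with $N_n \ge -c$ for all $n$. Shifting and rescaling a martingale that is bounded below yields a nonnegative martingale started at 1, to which Ville's inequality applies.

```latex
\begin{align*}
E_n &:= \frac{N_n + c}{1 + c} \ge 0, \qquad E_0 = \frac{1 + c}{1 + c} = 1, \\
\mathbb{E}_0[E_n \mid \mathcal{F}_{n-1}]
  &= \frac{\mathbb{E}_0[N_n \mid \mathcal{F}_{n-1}] + c}{1 + c}
   = \frac{N_{n-1} + c}{1 + c} = E_{n-1}, \\
\mathbb{P}_0\left(\exists n : N_n \ge \frac{1 + c}{\alpha} - c\right)
  &= \mathbb{P}_0\left(\exists n : E_n \ge \frac{1}{\alpha}\right) \le \alpha .
\end{align*}
```

In words, a gambler allowed to fall to net wealth $-c$ must reach $(1 + c)/\alpha - c$ rather than $1/\alpha$ to reject at level $\alpha$; with $c = 1$ and $\alpha = 0.05$ the bar rises from 20 to 39.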

Borrowing-based strategies can always be decomposed as convex combinations (“episodes”) of standard (no-borrowing) e-values (Wang et al., 2024).

4. Leverage Invariance and the Futility of Unconstrained Borrowing

A critical finding is leverage invariance: for any random variable $E$, transforming $E$ by a leverage map and standardizing the result to match e-value properties does not increase the maximal achievable evidence under any alternative. All borrow-adjusted e-values are equivalent, in maximal expectation, to standard (no-borrowing) e-values after proper normalization.

This means that systematic “tricks” with leverage or variable borrowing cannot outpace the best possible evidence achievable via conventional, no-borrowing betting strategies. Intuitive attempts to “turbocharge” sequential tests by exploiting borrowing are rendered futile after proper adjustment for liabilities.

5. Bargaining and Arbitrage: Mispriced Odds and Interest

When the “odds” offered in the betting game are mispriced in the statistician’s favor (e.g., a payoff factor $2(1 + \epsilon_n)$ on a correct bet for some $\epsilon_n > 0$), risk-free arbitrage becomes possible. This artificially inflates the gambler’s wealth even under the null, introducing misleading or unbounded evidence.

To compensate, one divides the gross wealth $W_n$ by the product of all mispricing factors, i.e., computes the “numéraire-adjusted” wealth $W_n / \prod_{i \le n} (1 + \epsilon_i)$, which restores an (almost) supermartingale structure (Wang et al., 2024). A maximal inequality shows that unless the sum $\sum_n \epsilon_n$ is vanishingly small, the claimed evidence cannot be interpreted as statistical, since arbitrage dominates.
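The effect of this adjustment can be seen in a toy calculation. The sketch below assumes a fair coin, a constant mispricing rate `eps` in every round, a payout of `2 * (1 + eps)` times the stake on heads, and a gambler who stakes a fixed fraction of wealth without borrowing; the function `null_expectations` and these choices are illustrative, not the construction in Wang et al. (2024). It enumerates all flip sequences to get exact null expectations of the raw and adjusted wealth.

```python
from itertools import product


def null_expectations(eps, rounds, fraction=1.0):
    """Exact fair-coin expectations of raw and numeraire-adjusted wealth."""
    numeraire = (1.0 + eps) ** rounds          # product of the mispricing factors
    weight = 0.5 ** rounds                     # probability of each flip sequence
    raw_mean = adjusted_mean = 0.0
    for flips in product((True, False), repeat=rounds):
        wealth = 1.0
        for heads in flips:
            stake = fraction * wealth          # no borrowing: stake <= wealth
            payout = 2.0 * (1.0 + eps) * stake if heads else 0.0
            wealth = wealth - stake + payout
        raw_mean += weight * wealth
        adjusted_mean += weight * wealth / numeraire
    return raw_mean, adjusted_mean


raw_mean, adjusted_mean = null_expectations(eps=0.1, rounds=10)
half_raw, half_adjusted = null_expectations(eps=0.1, rounds=10, fraction=0.5)
print(raw_mean, adjusted_mean, half_raw, half_adjusted)
```

With all-in bets the raw wealth has null expectation $(1 + \epsilon)^{n}$, which the adjustment brings back to 1; with half-stakes the adjusted expectation falls below 1, which is the supermartingale behavior described above.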

Interest accrual on loans (charging the same rate $\epsilon_n$ as the mispricing at each step) realigns the statistical and financial notions of fair evidence: only then does net wealth, properly adjusted, regain the true martingale property, restoring the correct e-process interpretation.

6. Implications, Pathologies, and Statistical Principles

Testing by betting offers several crucial advantages:

  • Anytime validity: E-processes guarantee Type I control regardless of stopping rules, unlike p-values, which only control level at a prespecified sample size (Shafer, 2019; Waudby-Smith et al., 2020).
  • Multiplicative accumulation: Evidence from successive rounds combines by simple multiplication of betting scores, whereas p-values from repeated looks at the data require multiplicity corrections.
  • Weak assumptions: The martingale property and e-process definition often hold under minimal parametric or structural requirements; they are robust to misspecification, adversarial manipulation, and, with appropriate adjustment, even to borrowing and mispriced odds.

However, unrestrained borrowing or acceptance of mispriced odds without proper interest adjustments fundamentally undermines the validity of betting-based evidence. In such cases, the e-process may report spurious evidence due to pure arbitrage rather than statistical truth. All valid extensions must invoke normalization or constraint factors (such as the factor $1 + B$ for liabilities bounded in expectation by $B$, or numéraire adjustment for odds) to maintain statistical meaning.

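As a concrete illustration, in a biased coin test with betting at the Kelly-optimal fraction, the gambler's log-wealth typically grows at rate $\mathrm{KL}(q \,\|\, p_0)$ per round under the alternative $\mathrm{Bernoulli}(q)$, where $p_0$ is the null heads probability; the corresponding p-value decays at this same exponential rate, so the e-value accumulates evidence as fast as a fixed-sample test while remaining valid under optional stopping.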
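The Kelly growth rate in the coin example can be verified numerically. The sketch below assumes a fair-coin null ($p_0 = 1/2$), double-or-nothing odds, and a true heads probability `q`; the function names are illustrative. Under these assumptions the Kelly fraction is $2q - 1$ and the resulting expected log-growth per round equals the Kullback-Leibler divergence from the alternative to the null.

```python
import math


def growth_rate(fraction, q):
    """Expected log-wealth growth per round when heads has probability q."""
    return q * math.log(1 + fraction) + (1 - q) * math.log(1 - fraction)


def kl_to_fair_coin(q):
    """KL(Bernoulli(q) || Bernoulli(1/2)) in nats."""
    return q * math.log(2 * q) + (1 - q) * math.log(2 * (1 - q))


q = 0.6
kelly = 2 * q - 1                      # Kelly-optimal stake as a fraction of wealth
best = growth_rate(kelly, q)
print(f"Kelly fraction {kelly:.2f}: growth {best:.5f}, KL {kl_to_fair_coin(q):.5f}")
```

Staking more or less than the Kelly fraction lowers the growth rate, and under the null ($q = 1/2$) any nonzero fraction gives negative expected log-growth, which is why wealth rarely becomes large when $H_0$ is true.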

7. Summary Table: Core Objects in Borrowing/Bargaining Extensions

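| Object | Mathematical Definition | Statistical Role |
|---|---|---|
| Gross wealth | $W_n = N_n + L_n$ | Cumulative evidence before debt adjustment |
| Liabilities | $L_n = \sum_{i \le n} \ell_i$ | Total amount borrowed |
| Net wealth | $N_n = W_n - L_n$ | True value after debts |
| Debt-adjusted e-value | $W_n / (1 + B)$ (liability bound $B$), $(N_n + c) / (1 + c)$ (net-wealth floor $-c$) | Controls Type I error at adjusted levels |
| Numéraire-adjusted | $W_n / \prod_{i \le n} (1 + \epsilon_i)$ | Restores supermartingale after mispricing |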

Debt- and arbitrage-adjusted statistics are indispensable for maintaining interpretability and risk control in applications of testing by betting that go beyond the classical no-borrowing regime (Wang et al., 2024).
