Kelly Betting as Bayesian Model Evaluation
- The paper introduces Kelly betting as a principled method for real-time Bayesian model evaluation, equating wealth growth with posterior credibility updates.
- It demonstrates the equivalence between optimal wagering, minimizing cumulative log-loss, and Bayesian updating through analytic links to KL divergence and regret bounds.
- The method supports online updating and market consensus formation using both full and fractional Kelly betting, improving model-selection accuracy.
Kelly betting as Bayesian model evaluation provides a mathematically rigorous framework for real-time, sequential assessment of probabilistic forecasting models. By treating each model or agent as a Kelly bettor and interpreting their evolving bankrolls as Bayesian credibilities, this approach unifies prediction-market dynamics, strictly proper scoring, information-theoretic optimality, and Bayesian model averaging. It yields analytic connections between log-loss, Kullback-Leibler divergence, and the rate at which the best model can be distinguished from suboptimal alternatives, while also supporting online updating and market consensus formation (Beygelzimer et al., 2012; Beuoy, 10 Feb 2026).
1. Mathematical Foundations and Setup
Consider $K$ competing models forecasting a sequence of binary outcomes $y_t \in \{0,1\}$ for $t = 1, \dots, T$. Each model $i$ outputs an updated predictive probability $p_{i,t} = P_i(y_t = 1 \mid y_{1:t-1})$. Each model is assigned a bankroll (credibility) $B_{i,t}$, initialized to the prior $B_{i,0} = P(M_i)$ with normalization $\sum_i B_{i,0} = 1$ (Beuoy, 10 Feb 2026).
At each round, models bet as Kelly agents against a “market” consensus probability $q_t$, with bet fractions given by the classical Kelly formula:

$$b_{i,t} = \frac{p_{i,t} - q_t}{1 - q_t}.$$

Once the outcome $y_t$ is revealed, each model's bankroll is updated multiplicatively:

$$B_{i,t} = B_{i,t-1} \times \begin{cases} p_{i,t}/q_t, & y_t = 1, \\ (1 - p_{i,t})/(1 - q_t), & y_t = 0, \end{cases}$$

or, equivalently, $B_{i,t} = B_{i,t-1}\, P_i(y_t)/Q_t(y_t)$, where $Q_t(1) = q_t$ and $Q_t(0) = 1 - q_t$ (Beygelzimer et al., 2012; Beuoy, 10 Feb 2026).
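A minimal numerical sketch of the update rule (the values of `p` and `q` are illustrative, not from the paper): the Kelly bet, once settled, multiplies the bankroll by exactly the likelihood ratio of the model's forecast against the market price.

```python
def kelly_fraction(p, q):
    """Kelly bet fraction for a binary claim priced at q, believed at p."""
    return (p - q) / (1 - q)

def bankroll_factor(p, q, y):
    """Multiplicative wealth update after outcome y for a Kelly bettor."""
    b = kelly_fraction(p, q)
    return 1 + b * (1 / q - 1) if y == 1 else 1 - b

# The Kelly update reduces to a likelihood ratio against the market price:
p, q = 0.7, 0.55
assert abs(bankroll_factor(p, q, 1) - p / q) < 1e-12
assert abs(bankroll_factor(p, q, 0) - (1 - p) / (1 - q)) < 1e-12
```

Negative fractions (when $p < q$) simply correspond to betting on the complementary outcome; the same two formulas cover both cases.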
2. Equivalence to Bayesian Model Evaluation
The growth of $B_{i,t}$ implements exact Bayesian updating for model credibility:

$$B_{i,t} = \frac{B_{i,0} \prod_{s=1}^{t} P_i(y_s)}{\sum_j B_{j,0} \prod_{s=1}^{t} P_j(y_s)} = P(M_i \mid y_{1:t}).$$

This alignment is seen by noting $B_{i,t} = B_{i,t-1}\, P_i(y_t)/Q_t(y_t)$ and that the “market” aggregates model forecasts into $q_t = \sum_j B_{j,t-1}\, p_{j,t}$, so that $Q_t(y_t)$ is exactly the Bayesian mixture (marginal) likelihood. The ratio $B_{i,t}/B_{j,t}$ exactly matches the posterior odds

$$\frac{B_{i,t}}{B_{j,t}} = \frac{P(M_i) \prod_{s \le t} P_i(y_s)}{P(M_j) \prod_{s \le t} P_j(y_s)}.$$

This shows that Kelly betting yields the same sequential model evidence as Bayesian filtering, with bankrolls serving as normalized posterior credibilities at every time step (Beygelzimer et al., 2012; Beuoy, 10 Feb 2026).
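The equivalence can be checked directly: running the Kelly market for a few rounds reproduces the Bayesian posterior over models computed from the raw likelihoods. A sketch with two hypothetical constant-forecast models and an arbitrary outcome sequence:

```python
import math

# Two hypothetical models with constant forecasts P(y_t = 1)
forecasts = [0.8, 0.4]
prior = [0.5, 0.5]
outcomes = [1, 1, 0, 1]

# Kelly market: bankrolls updated by likelihood ratio against the price
B = prior[:]
for y in outcomes:
    q = sum(B[i] * forecasts[i] for i in range(2))
    lik = [p if y == 1 else 1 - p for p in forecasts]
    B = [B[i] * lik[i] / (q if y == 1 else 1 - q) for i in range(2)]

# Direct Bayesian posterior from the same likelihoods
L = [math.prod((p if y == 1 else 1 - p) for y in outcomes) for p in forecasts]
Z = sum(prior[i] * L[i] for i in range(2))
posterior = [prior[i] * L[i] / Z for i in range(2)]

assert all(abs(B[i] - posterior[i]) < 1e-12 for i in range(2))
```

The two computations agree to floating-point precision, because the market price is exactly the Bayesian mixture likelihood at each step.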
3. Market Aggregation, Log-Loss, and Regret
At equilibrium, the market price is the consensus forecast

$$q_t = \sum_i w_{i,t-1}\, p_{i,t},$$

with $w_{i,t-1} = B_{i,t-1}$ interpreted as normalized model credibilities. The incremental log-growth for model $i$ satisfies

$$\log B_{i,t} - \log B_{i,t-1} = \ell_t(Q) - \ell_t(P_i),$$

where $\ell_t(P) = -\log P(y_t)$ is the log-loss. Thus, maximizing log-bankroll aligns with minimizing cumulative log-loss against the market mixture (Beuoy, 10 Feb 2026).
The expected excess growth rate is given by the negative KL divergence between the true data-generating distribution $\pi$ and model $i$:

$$\mathbb{E}_\pi[\log P_i(y_t)] - \mathbb{E}_\pi[\log \pi(y_t)] = -D_{\mathrm{KL}}(\pi \,\|\, P_i).$$

A worst-case log regret bound follows via wealth conservation in prediction markets: after $T$ rounds,

$$L_Q(T) \le L_i(T) + \log \frac{1}{w_{i,0}},$$

where $L_Q(T) = \sum_{t=1}^{T} \ell_t(Q)$ is the market log-loss, $L_i(T)$ is that for agent $i$, and $w_{i,0}$ is the prior wealth. The term $\log(1/w_{i,0})$ is the Bayesian model-evidence penalty term for expert $i$ (Beygelzimer et al., 2012).
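The wealth-conservation argument behind the regret bound can be verified numerically. A sketch with three hypothetical constant-forecast agents, uniform prior wealth, and a simulated Bernoulli stream (all parameters illustrative):

```python
import math
import random

random.seed(0)
K, T = 3, 200
models = [0.3, 0.5, 0.7]            # hypothetical constant forecasts
w0 = [1 / K] * K                    # uniform prior wealth
outcomes = [1 if random.random() < 0.65 else 0 for _ in range(T)]

B = w0[:]
L_market = 0.0                      # cumulative market log-loss
L_model = [0.0] * K                 # cumulative per-agent log-loss
for y in outcomes:
    q = sum(B[i] * models[i] for i in range(K))
    qy = q if y == 1 else 1 - q
    L_market += -math.log(qy)
    for i in range(K):
        py = models[i] if y == 1 else 1 - models[i]
        L_model[i] += -math.log(py)
        B[i] *= py / qy             # Kelly / Bayes wealth update

# Market log-loss beats every agent up to the log(1/w0_i) penalty
for i in range(K):
    assert L_market <= L_model[i] + math.log(1 / w0[i]) + 1e-9
```

The bound holds on every sample path, not just in expectation, because the market's cumulative likelihood is a wealth-weighted sum that dominates each agent's contribution.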
4. Posterior Evolution and Beta-Binomial Dynamics
Given a data stream $y_1, \dots, y_T \overset{\text{iid}}{\sim} \operatorname{Bernoulli}(\pi)$, assign a continuum of Bernoulli bettors with beliefs $\theta \in [0,1]$ and initial wealth density $\operatorname{Beta}(\alpha, \beta)$. The market price sequence updates exactly as the posterior mean of a Beta-Binomial model:

$$q_t = \frac{\alpha + s_{t-1}}{\alpha + \beta + t - 1}, \qquad s_{t-1} = \sum_{s < t} y_s.$$

Thus, the market price acts as the posterior predictive mean, reflecting aggregate learning as in Bayesian inference (Beygelzimer et al., 2012).
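Discretizing the continuum of Bernoulli bettors on a grid recovers this behavior numerically. A sketch assuming a uniform (Beta(1, 1)-like) prior wealth density approximated on a midpoint grid; the outcome sequence is illustrative:

```python
# Grid of Bernoulli bettors with beliefs theta and uniform prior wealth
N = 2001
thetas = [(k + 0.5) / N for k in range(N)]
B = [1 / N] * N
outcomes = [1, 0, 1, 1, 0, 1, 1]

s = 0  # running success count
for t, y in enumerate(outcomes, start=1):
    q = sum(B[k] * thetas[k] for k in range(N))      # market price
    # Price matches the Beta(1,1)-Binomial posterior predictive mean
    assert abs(q - (1 + s) / (2 + t - 1)) < 1e-3
    qy = q if y == 1 else 1 - q
    B = [B[k] * (thetas[k] if y == 1 else 1 - thetas[k]) / qy
         for k in range(N)]
    s += y
```

Each bettor's wealth evolves as the (discretized) Beta posterior density, so the wealth-weighted average belief is the posterior mean up to the small grid-discretization error.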
5. Fractional Kelly, Tempered Posteriors, and Discounting
Fractional Kelly betting generalizes the approach by scaling the bet size by a confidence parameter $\lambda_i \in [0,1]$:

$$b_{i,t}^{(\lambda_i)} = \lambda_i\, b_{i,t} = \lambda_i\, \frac{p_{i,t} - q_t}{1 - q_t}.$$

A bettor using fractional Kelly acts as a full Kelly bettor on the tempered belief $\tilde p_{i,t} = \lambda_i p_{i,t} + (1 - \lambda_i) q_t$. The consensus price becomes

$$q_t = \frac{\sum_i w_{i,t-1}\, \lambda_i\, p_{i,t}}{\sum_i w_{i,t-1}\, \lambda_i}.$$

When all $\lambda_i = \lambda < 1$, this implements a market tracking a discounted Bernoulli process, with the price converging to a time-discounted frequency. Empirically, the price behaves like an exponentially weighted estimate

$$q_t \approx \frac{\sum_{s \le t} \gamma^{\,t-s}\, y_s}{\sum_{s \le t} \gamma^{\,t-s}},$$

where $\gamma \in (0,1)$ is an effective discount factor determined by $\lambda$ (approaching $1$, i.e., no discounting, as $\lambda \to 1$).
This provides a probabilistic interpretation for fractional Kelly betting and ties it to credibility discounting (Beygelzimer et al., 2012).
6. Empirical Performance and Metric Comparison
In simulation studies involving binary outcome sequences (e.g., “volleyball” matches to 100 points), Kelly-Bayes evaluation is compared to log-loss and Brier score for the task of model selection:
- When the alternative model uses an incorrect but fixed win probability, Kelly selects the true model more often than log-loss/Brier (e.g., 55% vs 50%).
- For alternatives with recency bias, Kelly achieves substantially higher model-picking accuracy (96% vs 73% for log-loss).
- For alternatives with random drift, Kelly outperforms log-loss (74% vs 58%).
Over repeated matches, when the ending bankroll is carried over as the prior, Kelly-Bayes quickly comes to dominate these classical metrics in model-selection accuracy (Beuoy, 10 Feb 2026).
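The selection procedure in these experiments can be sketched as follows (illustrative parameters, not the paper's exact setup): simulate a Bernoulli stream, run a two-model Kelly market, and select the model with the larger terminal bankroll. The terminal log-bankroll ratio is exactly the log prior odds plus the cumulative log-likelihood ratio, which the sketch asserts:

```python
import math
import random

random.seed(42)
pi_true = 0.6
T = 500
outcomes = [1 if random.random() < pi_true else 0 for _ in range(T)]

# Hypothetical competitors: the true model vs a fixed 0.5 forecaster
models = {"true": pi_true, "fixed": 0.5}
B = {m: 0.5 for m in models}        # equal prior credibility
llr = 0.0                           # running log-likelihood ratio
for y in outcomes:
    q = sum(B[m] * models[m] for m in models)
    qy = q if y == 1 else 1 - q
    for m in models:
        py = models[m] if y == 1 else 1 - models[m]
        B[m] *= py / qy
    llr += math.log((pi_true if y == 1 else 1 - pi_true) / 0.5)

# Posterior odds = prior odds (here 1) times the likelihood ratio
assert abs(math.log(B["true"] / B["fixed"]) - llr) < 1e-9
selected = max(B, key=B.get)
```

With a data-generating probability away from 0.5 and enough rounds, the true model's bankroll grows at rate roughly $D_{\mathrm{KL}}(\pi \,\|\, 0.5)$ per round relative to the fixed forecaster, so it is selected with high probability.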
7. Real-Time Implementation and Generalizations
At each time step, the market consensus is computed as the bankroll-weighted average of model forecasts:

$$q_t = \sum_i B_{i,t-1}\, p_{i,t}.$$

Each bankroll is updated as above, and normalization ensures $B_{i,t}$ retains its interpretation as a posterior credibility. In the multinomial (multi-outcome) case, market clearing requires solving an eigenvector equation $M\mathbf{w} = \mathbf{w}$, where $M$ encodes model outcome probabilities and $\mathbf{w}$ is the hypothetical terminal wealth.
Pseudocode for the binary case is as follows:
```python
# B[i]: bankroll of model i, initialized to the prior; sum(B) == 1
B = [prior[i] for i in range(K)]

for t in range(T):
    # 1. Read forecasts p[i] = model i's P(y_t = 1), then price the market
    q = sum(B[i] * p[i] for i in range(K))
    # 2. Compute Kelly fractions
    b = [(p[i] - q) / (1 - q) for i in range(K)]
    # 3. Observe outcome o in {0, 1} and settle bets
    for i in range(K):
        if o == 1:
            B[i] *= 1 + b[i] * (1 / q - 1)   # equals p[i] / q
        else:
            B[i] *= 1 - b[i]                 # equals (1 - p[i]) / (1 - q)
    # 4. Normalize (guards against floating-point drift)
    Z = sum(B)
    B = [Bi / Z for Bi in B]
```
Conclusion
Kelly betting yields a formal equivalence between wealth maximization via optimal sequential wagering and Bayesian model evaluation. In both theoretical and empirical terms, this approach implements real-time, order-aware, and posterior-consistent updates of model credibility, recovers traditional Bayesian principles in aggregate, and supports discounted or tempered updates via fractional Kelly. Market prices correspond to posterior-predictive means, and the worst-case regret bounds have the interpretation of Bayesian model-selection penalties. Kelly-based Bayesian evaluation thus provides a principled alternative to classical scoring rules for sequential, real-time model assessment and aggregation (Beygelzimer et al., 2012; Beuoy, 10 Feb 2026).