- The paper introduces a framework and novel reward calculation technique for analyzing selfish mining profitability under general stochastic reward functions, capturing complex incentives like MEV.
- It analyzes β-cutoff selfish mining strategies using a Markov Chain and derives attacker profit equations, instantiated with a reward function combining block rewards, transaction fees, and random MEV.
- Results from instantiating a Bitcoin-like reward model demonstrate that accounting for transaction fees and MEV can lower the profitability threshold for selfish mining substantially, in total by over 50% relative to a pure selfish mining analysis that considers block rewards alone.
This paper introduces a framework for analyzing strategic behavior in Proof-of-Work consensus mechanisms under general miner reward functions, which can be stochastic, variable in time, and/or ephemeral. These reward functions aim to capture various existing reward sources, including Miner/Maximal Extractable Value (MEV), in modern blockchains. The paper focuses on analyzing the profitability of cutoff selfish mining strategies for reward functions identically distributed across forks, employing a novel reward calculation technique to capture non-linearity in general rewards.
The authors instantiate these results in a combined reward function that represents miner incentives in Bitcoin today. This reward function includes block rewards, linear-in-time transaction fees, and a third random reward based on observed transaction fee spikes. The instantiation enables qualitative observations, quantitative claims, and confirmation of the theoretical analysis using Monte Carlo simulations.
Key contributions of the paper include:
- A general reward function model to capture the aggregate incentives for following a specific strategy under distinct revenue streams.
- A set of properties characterizing subtleties of blockchain rewards.
- A technique to calculate expected attacker profit given an aggregate reward function under mild assumptions about the distribution of the constituent reward sources.
- An instantiated reward function combining block reward, transaction fee, and MEV rewards.
The paper begins by defining a stylized model of Proof-of-Work mining with general stochastic rewards, referred to as the Nakamoto Consensus Game (NCG). The NCG models a set of miners M, where each miner m∈M has hashrate α_m. At any time t, there is a public view V_t, consisting of the state of the blockchain known to all miners, and a private view V_t^m for each miner m, which includes V_t plus any additional blocks m knows about. Miners are rewarded for creating blocks on the eventual longest chain, which is modeled by a reward function R_m(t, V, B, r, B′) → ℝ, where t is the time, V is the view, B is a block in V, r is randomness, and B′ is a block created by miner m. Each miner m follows a strategy that takes as input a time t, a view V_t^m, and the reward R_m(t, V_t^m, B, r, B′) for extending each block B∈V_t^m by a valid block B′∈B_m(t, V_t^m, B, r), and outputs (i) a block B∈V_t^m to mine on, (ii) the contents of the next block B′∈B_m(t, V_t^m, B, r), and (iii) a (potentially empty) subset of blocks in V_t^m∖V_t to broadcast.
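As a rough, self-contained illustration of this interface (class and function names are hypothetical, not the paper's formalism), a baseline honest strategy producing the three outputs above could be sketched in Python as follows:

```python
from dataclasses import dataclass
from typing import Callable
import random

# Hypothetical, simplified stand-ins for the NCG objects; names are illustrative.
@dataclass
class Block:
    parent: "Block | None"
    miner: str
    timestamp: float

@dataclass
class View:
    blocks: list  # all blocks known in this view

# A reward function R_m(t, V, B, r, B') -> float: the reward miner m expects
# for extending block B at time t in view V, given randomness r.
RewardFn = Callable[[float, View, Block, random.Random, Block], float]

def honest_strategy(t: float, view: View, reward: RewardFn, miner: str):
    """Honest baseline: mine on the tip of the longest chain and broadcast immediately."""
    def height(b: Block) -> int:
        return 0 if b.parent is None else 1 + height(b.parent)
    tip = max(view.blocks, key=height)                       # (i) block to mine on
    candidate = Block(parent=tip, miner=miner, timestamp=t)  # (ii) contents of next block
    to_broadcast = [candidate]                               # (iii) blocks to broadcast
    return tip, candidate, to_broadcast
```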
The paper also defines properties of the reward functions, including:
- Miner-Independent Rewards: A reward function R is miner-independent if the set of valid views, the set of valid blocks extending each block in those views, and the rewards earned from any such valid block are the same for all miners.
- View-Independent Rewards: A reward function R is view-independent if the probability of earning reward x by extending block B_1 with timestamp t′ in view V_1 is the same as the probability of earning reward x by extending block B_2 with timestamp t′ in view V_2.
- Static Rewards: A reward function R is static if the probability of earning reward x at time t_1 by extending block B_1 with timestamp t_1−Δ in view V_1 is the same as the probability of earning reward x at time t_2 by extending block B_2 with timestamp t_2−Δ in view V_2.
- Persistent Rewards: A reward function R is persistent if, for all blocks B′ mined at time t extending block B and resulting in view V, R(t,V,B,r,B′) is bounded above by the total rewards minus the sum of rewards already claimed on the ancestral chain of B.
The paper then presents examples of reward functions, including transaction fees and Loss-Versus-Rebalancing (LVR), and uses them to illustrate the properties.
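As a minimal illustrative sketch (assuming, for simplicity, fees that accrue at a constant rate, which is a simplification introduced here), a linear-in-time transaction fee reward could look like:

```python
def linear_tx_fee_reward(t: float, parent_timestamp: float, fee_rate: float = 1.0) -> float:
    """Stylized linear-in-time transaction fees: fees accrue at `fee_rate` per unit
    of time since the parent block, so the reward depends only on the elapsed time
    t - parent_timestamp, not on which miner extends the block or on the particular
    view -- loosely illustrating the miner-independent, view-independent, and static
    properties above."""
    return fee_rate * (t - parent_timestamp)
```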
The paper analyzes β-cutoff selfish mining strategies, in which the attacker withholds blocks whose rewards fall below a threshold β. The attacker otherwise follows the longest chain and claims all available rewards, but withholds any block it mines whose time since its parent is less than β. The paper introduces the β-cutoff Markov Chain, whose states track the length of the attacker's hidden chain.
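A rough Monte Carlo sketch of the resulting hidden-chain-length process is given below; it simplifies the race-resolution and publication rules, so it should be read as an illustration of the state space rather than as the paper's exact β-cutoff Markov Chain.

```python
import random

def simulate_cutoff_states(alpha: float, beta: float, n_blocks: int = 100_000,
                           seed: int = 0) -> dict[int, float]:
    """Track the attacker's hidden chain length under a β-cutoff rule (simplified).
    Inter-block times are exponential with mean 1; the attacker mines each block
    with probability alpha."""
    rng = random.Random(seed)
    state = 0                                  # current hidden chain length
    visits: dict[int, int] = {}
    for _ in range(n_blocks):
        visits[state] = visits.get(state, 0) + 1
        attacker_block = rng.random() < alpha  # who mines the next block
        dt = rng.expovariate(1.0)              # time since the block's parent
        if state == 0:
            # Withhold only low-reward blocks (time since parent below β).
            state = 1 if (attacker_block and dt < beta) else 0
        elif attacker_block:
            state += 1                         # extend the hidden chain
        elif state <= 2:
            state = 0                          # small lead: race/override resolves, reset
        else:
            state -= 1                         # publish one block, keep the lead
    total = sum(visits.values())
    return {s: c / total for s, c in sorted(visits.items())}

# e.g. simulate_cutoff_states(alpha=0.35, beta=0.5)
```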
Using this Markov Chain, the stationary distribution is calculated, where p_i denotes the probability of being in State i. From the stationary distribution, the probability λ that a block produced in the Markov Chain is orphaned is
\begin{align*}
\lambda = p_1 (1-\alpha) \left(1+\frac{\alpha}{1-2\alpha}\right).
\end{align*}
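A direct transcription of this expression (valid for α < 1/2, with p_1 taken from the stationary distribution) is:

```python
def orphan_probability(p1: float, alpha: float) -> float:
    """λ = p_1 * (1 - α) * (1 + α / (1 - 2α)); requires α < 1/2."""
    return p1 * (1 - alpha) * (1 + alpha / (1 - 2 * alpha))
```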
The per-state attacker rewards, f_i, are then calculated as the expected reward of a canonicalized attacker block mined in State i.
For all states i≥2, the expected attacker reward collected in State i is given by:
\begin{align*}
f_i &= \sum_{j=0}^{i-1} \left[\alpha (1-\alpha)^j \int_{0}^{\infty} \frac{t^j e^{-t/(1-\lambda)}}{j!\,(1-\lambda)^{j+1}} \, \mathbb{E}_{r}[R(t)]\,dt\right]
\end{align*}
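The weight inside the integral is an Erlang (Gamma) density with shape j+1 and scale 1−λ, so f_i can be evaluated numerically once E_r[R(t)] is specified. A minimal sketch, assuming SciPy is available and using purely illustrative parameter values:

```python
import math
from scipy.integrate import quad

def f_i(i: int, alpha: float, lam: float, expected_reward) -> float:
    """f_i = sum_{j=0}^{i-1} alpha (1-alpha)^j * integral over t of the
    Erlang(j+1, scale=1-lambda) density times E_r[R(t)]."""
    scale = 1.0 - lam
    total = 0.0
    for j in range(i):
        def integrand(t, j=j):
            # Erlang(j+1, scale=1-lambda) density times the expected reward at time t.
            return (t**j * math.exp(-t / scale)
                    / (math.factorial(j) * scale**(j + 1))) * expected_reward(t)
        integral, _ = quad(integrand, 0.0, math.inf)
        total += alpha * (1 - alpha)**j * integral
    return total

# Example: a fixed block reward C plus linear-in-time fees, so E_r[R(t)] = C + t
# (parameter values here are purely illustrative).
print(f_i(3, alpha=0.35, lam=0.1, expected_reward=lambda t: 3.125 + t))
```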
The attacker's reward is then calculated as:
\begin{align*}
\text{ATTACKER REWARD} = f_0 p_0 + f_1 p_1 + \alpha \sum_{i=2}^{\infty} f_i p_{i-1}.
\end{align*}
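In practice the infinite tail can be truncated; a minimal sketch of assembling the attacker's reward from the p_i and f_i (the callable interface and truncation level are assumptions for illustration):

```python
def attacker_reward(f, p, alpha: float, max_state: int = 50) -> float:
    """ATTACKER REWARD = f_0 p_0 + f_1 p_1 + alpha * sum_{i>=2} f_i p_{i-1},
    with the tail truncated at `max_state`; `f` and `p` map i to f_i and p_i."""
    tail = sum(f(i) * p(i - 1) for i in range(2, max_state + 1))
    return f(0) * p(0) + f(1) * p(1) + alpha * tail
```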
The paper instantiates an aggregate reward function, R̂, composed of (1) a fixed block reward of size C, (2) a linear-in-time transaction fee reward, and (3) a random "extra" reward of size E awarded to a block based on the outcome of a Bernoulli trial with probability p.
R̂ is then defined as:
\begin{align*}
\hat{R}(t) &= C + t + E \cdot \mathds{1}[X=1], \; X\sim \text{Bernoulli}(p).
\end{align*}
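A one-line sampler for this reward, together with its expectation over the Bernoulli randomness (which is what enters the f_i integrals above), might look like the following illustrative sketch:

```python
import random

def sample_R_hat(t: float, C: float, E: float, p: float, rng: random.Random) -> float:
    """One draw of R_hat(t) = C + t + E * 1[X = 1], with X ~ Bernoulli(p)."""
    return C + t + (E if rng.random() < p else 0.0)

def expected_R_hat(t: float, C: float, E: float, p: float) -> float:
    """E_r[R_hat(t)] = C + t + p*E."""
    return C + t + p * E
```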
With this combined reward function, the paper analyzes the transition probabilities, stationary distribution, and expected attacker rewards.
For example, it was demonstrated that at γ=0 the profitability threshold of β-cutoff selfish mining decreases by about 22\%, and by a further 31\%, relative to pure selfish mining once the additional reward sources are taken into account.
The paper concludes by discussing potential future research directions, including applying the methodology more broadly to other consensus protocols and expanding the strategy space to capture more realistic mining strategies.
The paper also includes several appendices which provide further details on the derivations and calculations used in the analysis.
- Appendix A: Proof of Lemmas
- Appendix B: Derivation of f_3
- Appendix C: Evaluation of the f_{0,(i)}, f_{0,(ii)}, f_{0,(iii)} Integrals
- Appendix D: Derivation of Full Attacker Reward Equation
- Appendix E: Block Rewards Only
- Appendix F: Linear-in-Time Transaction Fees