Zero-Sum Dynkin Game Theory
- A zero-sum Dynkin game is a two-player stochastic optimal stopping game in which each player chooses a stopping time to maximize or minimize a common payoff under zero-sum conditions.
- The game’s value is characterized using dynamic programming, backward stochastic PDEs, and variational inequalities that define optimal stopping regions and free boundaries.
- Advanced techniques, including martingale decompositions and randomized equilibria, ensure the existence, uniqueness, and regularity of solutions in both Markovian and non-Markovian frameworks.
A zero-sum Dynkin game is a two-player stochastic optimal stopping game characterized by the existence of a value and explicit saddle-point stopping strategies, most often formulated in continuous time over stochastic processes, typically controlled diffusions or Markov processes. In these games, each player selects a stopping time—potentially based on different information structures—with the aim of maximizing (or minimizing) an expected reward or cost, resulting in a zero-sum outcome. The mathematical theory of zero-sum Dynkin games connects deep probabilistic, partial differential equation (PDE), and variational inequality frameworks, enabling robust qualitative and quantitative characterizations of equilibria, regularity of value functions, and optimal stopping regions.
1. Formal Definition and Classical Formulation
A zero-sum Dynkin game is defined on a filtered probability space with two players (maximizer and minimizer) each choosing an admissible stopping time, $\tau$ and $\sigma$, respectively. The typical payoff functional is
$J(\tau, \sigma) = E\big[ L_\tau \mathbf{1}_{\{\tau < \sigma\}} + U_\sigma \mathbf{1}_{\{\sigma < \tau\}} + M_\tau \mathbf{1}_{\{\tau = \sigma\}} \big],$
where $L$, $U$, $M$ are $(\mathcal{F}_t)$-adapted processes (often assumed RCLL or continuous, sometimes with more structure). The maximizer's goal is to select $\tau$ to maximize this expectation while the minimizer chooses $\sigma$ to minimize it. In classical (ordered) settings, $L_t \le M_t \le U_t$ typically holds, making $M$ an intermediary simultaneous-stopping payoff (Guo, 2020).
The value (saddle point) is established via
$V = \sup_{\tau} \inf_{\sigma} J(\tau, \sigma) = \inf_{\sigma} \sup_{\tau} J(\tau, \sigma),$
provided that a saddle point exists.
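The sup–inf/inf–sup value can be computed directly in a discrete-time toy model. Below is a minimal sketch (a symmetric random walk with illustrative obstacles $L(x) = \max(x, 0)$ and $U(x) = L(x) + 1$, and the convention that the lower payoff applies at the horizon; none of these choices come from the cited papers) using the ordered-case backward recursion $V_t = \min\{U, \max\{L, E[V_{t+1}]\}\}$:

```python
# Discrete-time zero-sum Dynkin game on a symmetric random-walk lattice.
# Obstacles are illustrative choices (not from the cited papers): L <= U.
T = 4  # horizon

def L(x):                # lower obstacle: maximizer's stopping payoff
    return max(x, 0.0)

def U(x):                # upper obstacle: minimizer's stopping payoff
    return L(x) + 1.0

V = {}                   # V[(t, x)] = game value at time t, state x
for x in range(-T, T + 1, 2):
    V[(T, x)] = L(x)     # convention: lower payoff applies at the horizon

for t in range(T - 1, -1, -1):
    for x in range(-t, t + 1, 2):
        cont = 0.5 * (V[(t + 1, x + 1)] + V[(t + 1, x - 1)])  # E[V_{t+1}]
        # ordered case: clip the continuation value between the obstacles
        V[(t, x)] = min(U(x), max(L(x), cont))

print(V[(0, 0)])  # prints 0.75 for these illustrative obstacles
```

With these choices the recursion gives $V_0(0) = 0.75$, sandwiched between $L(0) = 0$ and $U(0) = 1$; the same clipping step reappears as obstacle projection in the PDE schemes of Section 2.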
2. Dynamic Programming, Variational Inequalities, and BSPDEs
The zero-sum Dynkin game's value function is characterized via a dynamic programming principle related to variational inequalities with two obstacles. In Markovian settings (potentially with random coefficients), the value $u$ solves a backward stochastic partial differential variational inequality (BSPDVI), written informally as
$-\,du = \big(\mathcal{L}u + \mathcal{M}v + f\big)\,dt - v\,dW_t + dA_t - dK_t, \qquad L \le u \le U,$
where $\mathcal{L}$ and $\mathcal{M}$ are (possibly second-order) differential operators, $f$ is a running cost, $v$ is the second unknown (control) process, and $A$, $K$ are increasing reflection processes acting only on the contact sets $\{u = L\}$ and $\{u = U\}$, respectively (Tang et al., 2011). The solution is thus constrained between the lower ($L$) and upper ($U$) obstacles. The verification theorem proves that a strong solution to the BSPDVI fully characterizes the value and that the optimal stopping strategies for both players are the first times $u$ meets $L$ or $U$, i.e.,
$\tau^* = \inf\{t \ge 0 : u(t, X_t) = L(t, X_t)\}, \qquad \sigma^* = \inf\{t \ge 0 : u(t, X_t) = U(t, X_t)\}.$
This approach generalizes the classical deterministic or Markovian PDE variational inequality method by incorporating randomness in coefficients and solutions (Tang et al., 2011).
A salient result is the existence, uniqueness, and continuous dependence of the strong solution; comparison theorems for the BSPDVI ensure monotonicity, enabling the definition and qualitative study of free boundaries (stopping surfaces) in the state space.
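In the special case of deterministic coefficients, the two-obstacle problem reduces to a classical variational inequality that can be approximated on a grid. The sketch below (illustrative obstacles, a plain explicit finite-difference scheme; not the method of Tang et al., 2011) performs a backward heat-equation step and then projects the result between the obstacles at every time level, so the computed value stays sandwiched, $L \le u \le U$:

```python
# Explicit finite-difference sketch for a deterministic two-obstacle
# variational inequality  min{ max{ -u_t - u_xx/2, u - U }, u - L } = 0,
# solved backward in time with projection onto [L, U] at each step.
# Obstacles and coefficients are illustrative, not from Tang et al. (2011).
N, M = 100, 2000                       # space / time grid sizes
x_lo, x_hi, T = -2.0, 2.0, 1.0
dx, dt = (x_hi - x_lo) / N, T / M
lam = 0.5 * dt / dx ** 2               # explicit scheme: needs lam <= 1/2

xs = [x_lo + i * dx for i in range(N + 1)]
L = [max(x, 0.0) for x in xs]          # lower obstacle (maximizer stops)
U = [max(x, 0.0) + 0.2 for x in xs]    # upper obstacle (minimizer stops)

u = L[:]                               # terminal condition u(T, .) = L
for _ in range(M):
    nxt = u[:]                         # boundary nodes stay at the obstacle
    for i in range(1, N):
        cont = u[i] + lam * (u[i + 1] - 2.0 * u[i] + u[i - 1])
        nxt[i] = min(U[i], max(L[i], cont))   # project between obstacles
    u = nxt
```

The contact sets $\{u = L\}$ and $\{u = U\}$ of the computed solution then approximate the two players' stopping regions, separated by the free boundaries.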
3. Equilibrium Strategies and Martingale Characterizations
In the most general (possibly non-Markovian and information asymmetric) setting, the equilibrium (saddle point) is achieved in possibly randomized stopping strategies, characterized using supermartingale and submartingale value process constructions and Doob–Meyer decompositions (Angelis et al., 17 Oct 2025). Given the equilibrium strategies generated by randomised stopping rules, the equilibrium value process (an optional semimartingale aggregated over stopping times) satisfies strong martingale or supermartingale properties:
- The process
$\widehat{V}^{(1)}_\theta = \operatorname*{ess\,inf}_{\tau} E\big[ f(\tau)(1-\zeta^*_\tau) + \int_{[\theta,\tau)} g(u)\, du + h(\tau)\, \Delta\zeta^*_\tau \,\big|\, \mathcal{F}^1_\theta \big]$
is a martingale along the optimal strategies.
- Complementary slackness conditions ensure that the cumulative increasing process in the Doob–Meyer decomposition acts only when the value process equals the corresponding player's immediate stopping payoff.
This martingale framework enables explicit characterization and verification of equilibrium stopping times, robust to general filtrations and path-dependent reward structures (Angelis et al., 17 Oct 2025).
4. Non-Markovian, Asymmetric, and Mean-Field Extensions
Zero-sum Dynkin games have been systematically extended to:
- Non-Markovian settings with partial/asymmetric information: Value and equilibrium strategies are characterized by martingale and functional analytic methods rather than PDEs. The game admits a value in randomised stopping times for very general (càdlàg, integrable) reward processes, and the existence of a saddle point is established via Sion's min–max theorem under regularity/order assumptions (Angelis et al., 2020).
- Incomplete and asymmetric information: The uninformed player typically uses pure (adapted) stopping times, and the informed player exploits randomized strategies, with the value function governed by a system of quasi-variational inequalities (QVIs) with nonstandard constraints. The filtering of hidden states leads to an expanded state-space (e.g., including likelihood ratio/belief processes) and the construction of equilibrium via coupled free-boundary problems (Angelis et al., 2018).
- Mean-field interaction: In games with many players or where the reward depends on the law of the process, the value is characterized by doubly reflected mean-field backward stochastic differential equations (BSDEs). Existence, uniqueness, and propagation of chaos results ensure that as the number of interacting games grows, the values converge to the mean-field Dynkin game's value (Djehiche et al., 2022).
5. Existence and Structure of (Randomized) Equilibria
For general (possibly unordered) payoff processes, pure strategy Nash equilibria may not exist. Sufficient and necessary conditions have been developed (Guo, 2020, Christensen et al., 2023, Christensen et al., 12 Dec 2024):
- When the payoffs satisfy the ordering $L_t \le U_t$ for all $t$ (or its discrete-time analogue $L_n \le U_n$), the game admits pure strategy equilibria in threshold stopping times (first entry into specific regions).
- With unordered payoffs, the existence of a value and an $\varepsilon$-Nash equilibrium can still be guaranteed—often in Markovian randomized stopping times. Randomized equilibria are constructed by associating to each state a (state-dependent) stopping intensity and defining stopping as the first time the integrated intensity process exceeds an independent exponential random variable.
- Sufficient conditions for the existence of pure equilibrium include a “middle-value” condition: the simultaneous-stopping payoff is between the unilateral payoffs for every state. Otherwise, equilibria require randomization (e.g., Markovian state-dependent randomization or local time randomization at isolated points) (Guo, 2020, Christensen et al., 12 Dec 2024).
The explicit construction of such equilibria—allowing both Lebesgue-integrated and singular (local time) components—enables complete characterization even in diffusion-driven, non-ordered, or non-symmetric frameworks.
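The intensity-based randomization above is easy to simulate. The sketch below is a hypothetical illustration (intensity $\lambda(x) = \mathbf{1}_{\{x > 0\}}$, Brownian dynamics, and an Euler time step are all assumptions, not a construction from the cited papers): the player stops at the first time the integrated intensity along the state path exceeds an independent Exp(1) clock.

```python
import math
import random

# Simulation sketch of an intensity-based randomized stopping rule:
# stop at the first time the integrated intensity along the state path
# exceeds an independent Exp(1) clock.  The intensity lam(x) = 1_{x > 0}
# and the Brownian dynamics are hypothetical illustrations.
random.seed(0)

def lam(x):
    return 1.0 if x > 0.0 else 0.0   # stop at unit rate only while x > 0

def sample_stopping_time(dt=0.01, horizon=10.0):
    clock = random.expovariate(1.0)  # Exp(1) randomization clock
    x, t, acc = 0.0, 0.0, 0.0        # state, time, integrated intensity
    while t < horizon:
        if acc >= clock:
            return t                 # integrated intensity beat the clock
        x += math.sqrt(dt) * random.gauss(0.0, 1.0)  # Euler step of BM
        acc += lam(x) * dt
        t += dt
    return horizon                   # clock never rang: stop at the horizon

times = [sample_stopping_time() for _ in range(200)]
```

Because the clock is exponential and independent of the path, the resulting stopping time is randomized yet measurable with respect to the enlarged filtration, which is exactly the class of strategies in which the equilibria above are constructed.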
6. Regularity, Free Boundaries, and Numerical Schemes
The regularity of the value function (continuity, differentiability) and the structure of optimal stopping regions (free boundaries) are intensively studied:
- When problem data are regular, the value inherits corresponding smoothness (e.g., global regularity results in bi-dimensional incomplete information problems (Angelis et al., 2017)).
- Free boundaries are defined as surfaces where the value function meets an obstacle, marking transitions between continuation and stopping regions; their monotonicity and continuity can often be shown under monotonicity or comparison theorem arguments (Tang et al., 2011).
- For implementation, recent methods employ structure-preserving convex approximations, including neural networks that approximate the value function in space within dynamic programming recursions in time, with explicit convexification steps that preserve the theoretical properties required by the game's structure (Baňas et al., 2023).
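One such structure-preserving step can be illustrated in isolation. The sketch below (a hypothetical convexification pass on a grid of values, not the actual scheme of Baňas et al., 2023) replaces approximate values by their lower convex envelope via Andrew's monotone-chain lower hull, enforcing convexity in the state variable without raising the approximation anywhere:

```python
# Toy structure-preserving step: replace grid values by their lower convex
# envelope (monotone-chain lower hull + piecewise-linear interpolation).
# A hypothetical illustration, not the actual scheme of Banas et al. (2023).
def lower_convex_envelope(xs, ys):
    pts = list(zip(xs, ys))          # xs assumed strictly increasing
    hull = []
    for p in pts:
        while len(hull) >= 2:
            o, a = hull[-2], hull[-1]
            # drop a if it lies on or above the chord from o to p
            if (a[0] - o[0]) * (p[1] - o[1]) - (a[1] - o[1]) * (p[0] - o[0]) <= 0:
                hull.pop()
            else:
                break
        hull.append(p)
    env, j = [], 0                   # interpolate the hull back onto xs
    for x in xs:
        while j + 1 < len(hull) and hull[j + 1][0] <= x:
            j += 1
        if j + 1 < len(hull):
            (x0, y0), (x1, y1) = hull[j], hull[j + 1]
            env.append(y0 + (y1 - y0) * (x - x0) / (x1 - x0))
        else:
            env.append(float(hull[j][1]))
    return env

vals = [0.0, 2.0, 0.0, 2.0, 0.0]     # a non-convex value profile
print(lower_convex_envelope([0, 1, 2, 3, 4], vals))
# -> [0.0, 0.0, 0.0, 0.0, 0.0]
```

The envelope never exceeds the input values and leaves an already-convex profile unchanged, which is the kind of invariant a structure-preserving scheme is designed to maintain across dynamic programming steps.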
7. Applications and Further Developments
Zero-sum Dynkin games and their analytic theory provide the foundation for:
- Game options and Israeli option pricing in mathematical finance, where both parties have terminating rights and the value function captures the robust hedging price (Angelis et al., 2018, Angelis et al., 2017).
- Singular stochastic control, as the value of some singular control problems can be written as the solution of a Dynkin game, linking reflected diffusions and HJB variational inequalities (Yang, 2014).
- Markov and non-Markov switching regimes, partially observed control, and mean-field insurance or investment problems (Djehiche et al., 2022).
The robust theoretical apparatus—combining martingale methods, PDE/BSPDE analysis, and constructive algorithmic techniques—enables treatment of a wide class of stopping games under diverse information and dynamical constraints, with significant implications for both theory and quantitative applications in economics, finance, and operations research.
Key References:
- Characterization via BSPDVI: (Tang et al., 2011)
- Randomized equilibrium construction: (Christensen et al., 2023, Christensen et al., 12 Dec 2024)
- Martingale theory and non-Markovian settings: (Angelis et al., 17 Oct 2025, Angelis et al., 2020)
- Equilibria with unordered payoff processes: (Guo, 2020)
- Mean-field framework: (Djehiche et al., 2022)
- Neural network and structure-preserving numerical schemes: (Baňas et al., 2023)