Multi-Period Distributionally Robust Optimization

Updated 26 November 2025

Multi-period DRO is a mathematical framework for sequential decision-making under uncertainty using Wasserstein ambiguity sets.
It employs dynamic programming and recursive value function computation to guarantee worst-case performance across time periods.
The approach bridges discrete-time methods and continuous-time limits, incorporating PDE characterizations with online data-driven adaptations.

Multi-period distributionally robust optimization (DRO) is a mathematical framework for sequential decision-making under distributional uncertainty that evolves over time. Unlike conventional stochastic optimization (SO), which requires a known probability distribution, DRO operates over ambiguity sets that contain all plausible distributions and optimizes against the worst case among them. In the multi-period setting, decisions and ambiguity sets are composed dynamically over several time steps, leading to dynamic programming, PDE, and semigroup characterizations. Two primary and complementary approaches have emerged: (i) a semigroup/sensitivity framework for scaling limits in continuous time, particularly via Wasserstein ambiguity, and (ii) an online, data-driven approach for learning and adapting ambiguity sets from scenario data streams. Both lines connect multi-period DRO to dynamic programming, stochastic control, and robust Markov decision processes.

1. Discrete-Time Multi-Period DRO Formulation

The multi-period DRO problem is formulated on a discrete, finite time horizon with $N$ periods of length $\varepsilon$. At each period, a reference dynamic, modeled as a transition kernel $P^a_\varepsilon(x,\cdot)$ indexed by the action $a \in A$, specifies the nominal law of motion. The decision-maker guards against adversarial perturbations of this law within a Wasserstein ball whose radius is proportional to the step size: $B^a_\varepsilon(m) = \{\nu \in \mathcal{P}_p(\mathbb{R}^d) : W_p(\mu^a_\varepsilon, \nu) \le m\varepsilon\}$, where $\mu^a_\varepsilon$ is the nominal one-step distribution under action $a$, $W_p$ is the $p$-Wasserstein distance, and $m\ge 0$ is the robustness parameter.
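As a small illustration of the ball constraint, the sketch below computes $W_p$ between two one-dimensional empirical measures with equally many atoms, where the optimal coupling matches sorted samples, and checks membership in a ball of radius $m\varepsilon$. The function names and sample values are hypothetical and serve only the example; the setting in the text is $\mathbb{R}^d$ with general measures.

```python
def wasserstein_1d(xs, ys, p=1):
    """p-Wasserstein distance between two 1D empirical measures
    with the same number of equally weighted atoms."""
    if len(xs) != len(ys):
        raise ValueError("this sketch assumes equally many atoms")
    # In one dimension the optimal coupling pairs order statistics.
    pairs = zip(sorted(xs), sorted(ys))
    return (sum(abs(x - y) ** p for x, y in pairs) / len(xs)) ** (1.0 / p)


def in_ball(nominal, candidate, m, eps, p=1):
    """Check whether W_p(nominal, candidate) <= m * eps."""
    return wasserstein_1d(nominal, candidate, p) <= m * eps + 1e-12


# Hypothetical nominal atoms and a candidate obtained by shifting them by 0.05.
nominal = [-0.1, 0.0, 0.1]
shifted = [z + 0.05 for z in nominal]
```

Shifting every atom by the same amount moves the measure a Wasserstein distance equal to that shift, so `shifted` lies in the ball for `m=1.0, eps=0.1` but not for `m=1.0, eps=0.01`.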

Backward dynamic programming defines the value functions recursively: $V_k(x) = \inf_{a \in A} \sup_{\nu \in B^a_\varepsilon(m)} \mathbb{E}_{\nu}\left[V_{k+1}(\psi^a_\varepsilon(x) + Z)\right], \quad k = N-1,\ldots,0,$ with $V_N(x) = h(x)$ for terminal cost $h$ (Nendel et al., 25 Nov 2025). Here $Z \sim \nu$ is the random perturbation added to the state update $\psi^a_\varepsilon(x)$. This compositional min-max recursion generates the fundamental multi-period structure: at each period, the agent selects an action, nature selects a distribution within the ambiguity set, and the process evolves.
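The recursion can be sketched numerically on a one-dimensional grid. The example below is a toy approximation under several assumptions that are not from the source: $p = 1$, a finite action set, a two-point nominal noise law, a hypothetical update $\psi^a_\varepsilon(x) = x + \varepsilon a$, and the standard dual form of the Wasserstein worst case, $\inf_{\lambda \ge 0} \lambda\delta + \mathbb{E}_\mu[\sup_y (V(y) - \lambda|y - Z|)]$, evaluated with $y$ restricted to the grid and $\lambda$ to a short candidate list.

```python
def worst_case_mean(V, grid, atoms, probs, delta, lambdas):
    """Approximate sup of E_nu[V] over the W_1 ball of radius delta around the
    discrete nominal law (atoms, probs), using the dual
    inf_lam lam*delta + E_mu[max_y (V(y) - lam*|y - Z|)]."""
    best = float("inf")
    for lam in lambdas:
        total = lam * delta
        for z, q in zip(atoms, probs):
            total += q * max(v - lam * abs(y - z) for y, v in zip(grid, V))
        best = min(best, total)
    return best


def robust_backward_recursion(h, grid, actions, psi, noise, eps, m, n_periods, lambdas):
    """Compute V_0 on the grid by iterating the inf-sup recursion backward."""
    lo, hi = grid[0], grid[-1]
    probs = [q for _, q in noise]
    V = [h(x) for x in grid]                       # V_N = h
    for _ in range(n_periods):                     # k = N-1, ..., 0
        V_next, V = V, []
        for x in grid:
            values = []
            for a in actions:
                center = psi(x, a, eps)            # nominal state update
                # Clamp to the grid range: a crude boundary treatment.
                atoms = [min(max(center + z, lo), hi) for z, _ in noise]
                values.append(
                    worst_case_mean(V_next, grid, atoms, probs, m * eps, lambdas)
                )
            V.append(min(values))                  # agent minimizes over actions
    return V


# Hypothetical toy instance.
grid = [round(-2.0 + 0.1 * i, 10) for i in range(41)]
lambdas = [0.0, 0.25, 0.5, 0.75, 1.0, 1.5, 2.0, 4.0]
noise = [(-0.1, 0.5), (0.1, 0.5)]                  # (atom, probability) pairs
psi = lambda x, a, eps: x + eps * a
common = dict(h=abs, grid=grid, actions=[-1.0, 0.0, 1.0], psi=psi,
              noise=noise, eps=0.1, n_periods=5, lambdas=lambdas)
V_nominal = robust_backward_recursion(m=0.0, **common)
V_robust = robust_backward_recursion(m=0.5, **common)
```

In this sketch the robust values dominate the nominal ones pointwise, since enlarging the ball can only increase the inner supremum. The grid restriction and the finite list of multipliers make the result an approximation, not an exact evaluation of the recursion.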

2. Scaling Limits and Semigroup Perspective

As the period length $\varepsilon \to 0$ and the number of periods $N \to \infty$ (with $N\varepsilon = T$ fixed), the discrete-time composition yields a monotone strongly continuous (“C₀”) semigroup $(S_t)_{t\geq 0}$ on $C_b(\mathbb{R}^d)$ (bounded, continuous functions), arising as the scaling limit of the $N$-step operator. Generalizing to arbitrary partitions $\pi$ of the interval $[0,T]$, refinement of the mesh leads to convergence in the mixed topology: $S^\varepsilon_T f = \inf_{\text{partitions }\pi,\, \text{mesh}(\pi)\leq\varepsilon} I(t_1)\circ\cdots\circ I(T-t_{n-1})f \to S_T f \quad \text{as}\ \varepsilon \to 0,$ where $I(t)$ is understood as the one-step robust operator for a step of length $t$. Under mild regularity, this abstraction establishes that the multi-period DRO framework admits a well-posed continuous-time limit with strong functional-analytic structure (Nendel et al., 25 Nov 2025).

3. Generator Characterization and Nonlinear PDE

The infinitesimal generator $A$ of the semigroup $(S_t)_{t\geq 0}$ (not to be confused with the action set, also written $A$) is given by $A = L + H$: $A f(x) = \inf_{a\in A} L^a f(x) + m\|\nabla f(x)\|,$ where $L^a$ is the generator of the nominal process (e.g., if $P^a_\varepsilon$ is from an Itô diffusion, $L^a f(x) = \langle b(a,x), \nabla f(x)\rangle + \tfrac{1}{2}\operatorname{Tr}[\sigma(a,x)\sigma(a,x)^\top \nabla^2 f(x)]$), and $H f(x) = m\|\nabla f(x)\|$ encodes the local worst-case sensitivity due to Wasserstein ambiguity (Nendel et al., 25 Nov 2025).
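The Wasserstein term in the generator can be motivated by a first-order expansion. The following is a heuristic sketch, not the cited paper's proof. It assumes $f$ is smooth with bounded derivatives and $p > 1$, and it introduces two symbols for illustration only: $\delta = m\varepsilon$ for the ball radius and $q$ for the conjugate exponent of $p$. For small $\delta$, the worst-case expectation over a Wasserstein ball expands as

$\sup_{W_p(\mu,\nu)\le\delta} \mathbb{E}_\nu[f] = \mathbb{E}_\mu[f] + \delta\,\big(\mathbb{E}_\mu\|\nabla f\|^q\big)^{1/q} + o(\delta), \qquad \tfrac{1}{p}+\tfrac{1}{q}=1.$

As $\varepsilon \to 0$ the nominal one-step law concentrates near $x$, so the gradient norm under $\mu$ tends to $\|\nabla f(x)\|$, while the nominal expectation satisfies $\mathbb{E}_{\mu^a_\varepsilon}[f(\psi^a_\varepsilon(x)+Z)] = f(x) + \varepsilon L^a f(x) + o(\varepsilon)$. Combining the two gives

$\sup_{\nu\in B^a_\varepsilon(m)} \mathbb{E}_\nu\left[f(\psi^a_\varepsilon(x)+Z)\right] = f(x) + \varepsilon\big(L^a f(x) + m\|\nabla f(x)\|\big) + o(\varepsilon).$

Taking the infimum over $a$, subtracting $f(x)$, and dividing by $\varepsilon$ recovers $\inf_{a} L^a f(x) + m\|\nabla f(x)\|$ in the limit.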

The value function $u(t,x)=S_t h(x)$ is characterized as the unique bounded viscosity solution to the nonlinear PDE

$\partial_t u = \inf_{a \in A}\left[\langle b(a,x), \nabla u\rangle + \frac{1}{2}\operatorname{Tr}[\sigma(a,x)\sigma(a,x)^\top \nabla^2 u]\right] + m\|\nabla u\|, \quad u(0, x) = h(x),$

which can also be interpreted in a robust control or differential game context.
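To make the evolution concrete, the sketch below steps $u(t,\cdot) = S_t h$ forward in time from $u(0,\cdot) = h$ using the generator $\inf_{a} L^a f + m\|\nabla f\|$ in one space dimension. It is an illustrative explicit upwind finite-difference scheme under assumptions not taken from the source: a finite action set, a constant scalar $\sigma$, zero-gradient boundaries obtained by replicating edge values, and hypothetical function names.

```python
def hjb_step(u, xs, dx, dt, actions, drift, sigma, m):
    """One explicit time step of du/dt = inf_a L^a u + m*|u_x| in 1D."""
    n = len(u)
    new = []
    for i in range(n):
        left, right = u[max(i - 1, 0)], u[min(i + 1, n - 1)]  # replicate edges
        fwd = (right - u[i]) / dx          # forward difference
        bwd = (u[i] - left) / dx           # backward difference
        lap = (right - 2.0 * u[i] + left) / dx ** 2
        # Upwind the drift: forward difference for b > 0, backward for b < 0.
        nominal = min(
            max(drift(a, xs[i]), 0.0) * fwd
            + min(drift(a, xs[i]), 0.0) * bwd
            + 0.5 * sigma ** 2 * lap
            for a in actions
        )
        # Monotone discretization of m*|u_x| = m * sup_{|v|<=1} v*u_x.
        robust = m * max(fwd, -bwd, 0.0)
        new.append(u[i] + dt * (nominal + robust))
    return new


def solve(h, xs, T, n_steps, actions, drift, sigma, m):
    """Integrate from u(0, x) = h(x) up to time T on a uniform grid xs."""
    dx, dt = xs[1] - xs[0], T / n_steps
    speed = max(abs(drift(a, x)) for a in actions for x in xs)
    # Stability (CFL-type) condition for the explicit monotone scheme.
    if dt * ((speed + m) / dx + sigma ** 2 / dx ** 2) > 1.0:
        raise ValueError("time step too large for the explicit scheme")
    u = [h(x) for x in xs]
    for _ in range(n_steps):
        u = hjb_step(u, xs, dx, dt, actions, drift, sigma, m)
    return u


# Hypothetical toy instance: controlled drift b(a, x) = a, terminal cost |x|.
xs = [-2.0 + 0.1 * i for i in range(41)]
args = dict(h=abs, xs=xs, T=0.5, n_steps=40, actions=[-1.0, 0.0, 1.0],
            drift=lambda a, x: a, sigma=0.5)
u_nominal = solve(m=0.0, **args)
u_robust = solve(m=0.5, **args)
```

Because the scheme is monotone under the stated step-size condition, the robust solution with $m > 0$ stays above the nominal one with $m = 0$ at every grid point. An explicit scheme is chosen here for brevity; it is not meant as a recommended solver.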

4. Online Data-Driven Multi-Period DRO

Absent knowledge of the true underlying distribution, a data-driven approach constructs and updates ambiguity sets $P_t$ online as more scenario data become available. At each period $t$, after observing scenario $s_t$, the ambiguity set is tightened (e.g., via confidence-interval, $\ell_2$-Wasserstein-type, or kernel-metric sets), ensuring with high probability that the true distribution $p^*$ remains inside: $P_t = \{p \in \Delta_S : d(p, \hat{p}_t) \le \varepsilon_t\},$ with $\varepsilon_t = O(\sqrt{\log t / t})$, so $P_t$ shrinks to $\{p^*\}$ as $t \to \infty$ (Aigner et al., 2023).
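A minimal sketch of such an update is given below, assuming a finite scenario set indexed $0,\ldots,S-1$, the empirical distribution as center $\hat{p}_t$, the Euclidean norm as the distance $d$, and a radius $c\sqrt{\log(t+1)/t}$. The constant `c`, the shift by one inside the logarithm (to avoid a zero radius at $t=1$), and the function names are illustrative choices, not taken from the cited work.

```python
import math
from collections import Counter


def ambiguity_set(observations, n_scenarios, c=1.0):
    """Return (p_hat, radius) after t = len(observations) >= 1 scenarios."""
    t = len(observations)
    counts = Counter(observations)
    p_hat = [counts[s] / t for s in range(n_scenarios)]   # empirical distribution
    radius = c * math.sqrt(math.log(t + 1) / t)           # O(sqrt(log t / t))
    return p_hat, radius


def contains(p, p_hat, radius):
    """Membership test for the ball {p : ||p - p_hat||_2 <= radius}."""
    return math.dist(p, p_hat) <= radius


# Hypothetical scenario stream over three scenarios.
stream = [0, 1, 0, 2, 0, 0, 1, 0]
p_hat, radius = ambiguity_set(stream, n_scenarios=3)
```

Feeding a longer stream with the same frequencies leaves `p_hat` unchanged but shrinks `radius`, which is the tightening behavior described above.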

The robust decision $x_t$ in each period is determined by an embedded two-player online game:

Nature (the “$p$-player”) chooses $p_t \in P_{t-1}$ to maximize expected cost given $x_{t-1}$;
The agent (the “$x$-player”) chooses $x_t \in \mathcal{X}$ to minimize expected cost under $p_t$.

This leads to an online gradient-descent procedure in which each round requires only a projection and a standard SO step, never the full min-max problem. The algorithm attains dynamic regret $O(\log T / \sqrt{T})$, and the decisions $x_t$ converge to the SO optimum (Aigner et al., 2023).
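The two-player loop can be sketched as follows. This is a simplified illustration and not the algorithm of the cited paper: it assumes a finite decision set, a cost matrix `cost[x][s]`, confidence-interval ambiguity sets (per-scenario bounds intersected with the simplex), a fixed step size `eta`, and hypothetical function names. Nature takes one gradient-ascent step on the expected cost of the previous decision and projects onto $P_{t-1}$; the agent then solves a plain expected-cost minimization under $p_t$.

```python
import math
import random


def project_box_simplex(v, lo, hi, iters=80):
    """Euclidean projection of v onto {p : lo <= p <= hi, sum(p) = 1}.
    The projection has the form clip(v - tau, lo, hi); tau is found by bisection."""
    def total(tau):
        return sum(min(max(x - tau, l), h) for x, l, h in zip(v, lo, hi))
    a = min(x - h for x, h in zip(v, hi))     # total(a) = sum(hi) >= 1
    b = max(x - l for x, l in zip(v, lo))     # total(b) = sum(lo) <= 1
    for _ in range(iters):
        mid = 0.5 * (a + b)
        if total(mid) > 1.0:
            a = mid
        else:
            b = mid
    tau = 0.5 * (a + b)
    return [min(max(x - tau, l), h) for x, l, h in zip(v, lo, hi)]


def interval_set(counts, n_obs, c):
    """Confidence-interval ambiguity set as per-scenario bounds (lo, hi)."""
    k = len(counts)
    if n_obs == 0:
        return [0.0] * k, [1.0] * k           # no data yet: whole simplex
    r = c * math.sqrt(math.log(n_obs + 1) / n_obs)
    p_hat = [cnt / n_obs for cnt in counts]
    return [max(0.0, q - r) for q in p_hat], [min(1.0, q + r) for q in p_hat]


def online_dro(cost, sample, rounds, eta=0.1, c=0.5):
    n_dec, n_scen = len(cost), len(cost[0])
    counts = [0] * n_scen
    p = [1.0 / n_scen] * n_scen               # nature's initial distribution
    x = 0                                     # agent's initial decision
    for t in range(1, rounds + 1):
        lo, hi = interval_set(counts, t - 1, c)          # P_{t-1}
        # Nature: the gradient of sum_s p_s * cost[x][s] in p is cost[x].
        ascent = [q + eta * g for q, g in zip(p, cost[x])]
        p = project_box_simplex(ascent, lo, hi)
        # Agent: standard SO step under p_t.
        x = min(range(n_dec),
                key=lambda d: sum(cv * q for cv, q in zip(cost[d], p)))
        counts[sample()] += 1                 # observe scenario s_t
    return x, p


# Hypothetical instance: three decisions, three scenarios.
rng = random.Random(0)
p_star = [0.7, 0.2, 0.1]
cost = [[1.0, 5.0, 5.0], [4.0, 2.0, 2.0], [3.0, 3.0, 3.0]]
x_final, p_final = online_dro(
    cost, lambda: rng.choices(range(3), weights=p_star)[0], rounds=2000
)
```

In this instance decision 0 is the SO optimum under `p_star`, and it remains optimal whenever the first scenario has probability above one half, so the final decision settles on it once the intervals have tightened. No round solves a full min-max problem: each round costs one projection and one expectation minimization.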

5. Computational and Analytical Characteristics

Table: Discrete-Time vs. Data-Driven Multi-Period DRO

Aspect	Discrete-Time/Semigroup Approach	Data-Driven Online Approach
Uncertainty model	Wasserstein ball around nominal kernel	Ambiguity sets (confidence/Wasserstein/kernel)
Dynamics	Sequential composition of one-step DROs	Scenario stream, empirical update
Limiting object	Nonlinear semigroup, HJB-type PDE	Regret-optimal online iterates
Computational procedure	Backward recursion, semigroup construction	Projected gradient step, per-period expectation minimization
Convergence/Regret	Viscosity solution uniqueness (Crandall–Lions)	$O(\log T/\sqrt{T})$ dynamic regret

The semigroup approach enables rigorous analysis via nonlinear operator theory, Taylor expansion sensitivity for the Wasserstein term, and viscosity solution arguments for the limiting PDE (Nendel et al., 25 Nov 2025). The data-driven method leverages online convex optimization techniques and a robust treatment of time-evolving constraints, bounding the “path-length” of ambiguity set changes to analyze regret (Aigner et al., 2023).

6. Applications and Empirical Performance

Multi-period DRO has direct applications in sequential decision-making under uncertainty, such as network routing under uncertain travel times. Empirical studies demonstrate that data-driven online multi-period DRO rapidly reduces robust costs and converges to optimal stochastic solutions with sublinear regret. It also reduces per-period computational overhead by an order of magnitude compared to fully reformulated static DRO (e.g., from several seconds to fractions of a second per period in large routing instances) (Aigner et al., 2023). The continuous-time scaling limit provides rigorous connections between data-driven iterative algorithms, control-theoretic interpretations, and PDE characterizations of robust value functions (Nendel et al., 25 Nov 2025).

7. Theoretical Challenges and Structural Insights

Key technical challenges stem from non-convexity of one-step DRO operators (due to the inf–sup structure), projective limits of operator families, and the necessity of monotonicity and strong continuity for the limiting semigroup. Mixed-topology convergence plays a central role, ensuring that operator sequences yield well-defined continuous-time dynamics. In online frameworks, drift in the ambiguity sets necessitates fine-grained path-length control, with performance guarantees achieved by bounding the Hausdorff distances between successive sets (Nendel et al., 25 Nov 2025; Aigner et al., 2023).

The bridge established from discrete-time multi-period Wasserstein DRO to continuous-time robust stochastic control and viscosity PDEs provides a unified perspective that connects optimization, probability, analysis, and learning-theoretic viewpoints on robust sequential decisions.