
Stochastic Dual Dynamic Programming

Updated 13 March 2026
  • SDDP is a decomposition algorithm that approximates cost-to-go functions in multistage stochastic optimization by representing them as a supremum of affine cuts.
  • It comes in single-cut and multicut variants, which handle the expectation in the dynamic programming recursion differently and trade approximation accuracy against computational cost.
  • Advanced cut management techniques such as Level 1 and LML 1 selectors optimize performance by limiting the number of active cuts while ensuring convergence.

Stochastic Dual Dynamic Programming (SDDP) is a fundamental decomposition-based algorithm for the numerical solution of multistage stochastic linear and convex optimization problems with recourse. SDDP has achieved prominence in operations research, stochastic control, and energy system planning, owing to its ability to efficiently approximate cost-to-go (value) functions as polyhedral lower envelopes (suprema of affine cuts), thus mitigating the curse of dimensionality inherent in classical scenario tree and dynamic programming approaches.

1. Problem Class and Dynamic Programming Recursion

SDDP targets the $T$-stage risk-neutral multistage stochastic linear program, classically written in dynamic programming form as follows. At each stage $t=1,\ldots,T$, a decision $x_t \in \mathbb{R}^n$ is made after observing the current realization of uncertainties $\xi_t$, which are assumed to have finite discrete support and stagewise independence. The value functions satisfy nested Bellman recursions:

$$V_t(x_{t-1}) = \mathbb{E}_{\xi_t}\left[ Q_t(x_{t-1}, \xi_t) \right], \qquad V_{T+1} \equiv 0,$$

where

$$Q_t(x_{t-1}, \xi_t) = \min_{x_t \geq 0}\left\{ c_t(\xi_t)^\top x_t + V_{t+1}(x_t) \;\middle|\; A_t(\xi_t)x_t + B_t(\xi_t)x_{t-1} = b_t(\xi_t) \right\}.$$

Assumptions for convergence include nonempty, bounded recourse sets for all feasible histories; independent, finite-support random data beyond stage 1; and the ability to solve each stage linear program to an extreme point solution (Guigues et al., 2019).

2. Algorithm Structure: Single-Cut and Multicut SDDP

SDDP algorithms decompose the dynamic program by recursively approximating each cost-to-go function from below using a supremum of affine functions ("cuts") derived from dual solutions of backward pass subproblems.

  • Single-Cut SDDP maintains, for each stage $t$, a polyhedral under-approximation

$$\hat V_t^k(x_{t-1}) = \max_{i \leq k} \left\{ \alpha_t^i + \beta_t^{i\top} x_{t-1} \right\}$$

generated at trial points $x_{t-1}^i$ from the $i$-th forward pass.

  • Multicut SDDP (MuDA), instead of collapsing the expectation in $V_t$, constructs for each scenario $\xi_{tj}$ at stage $t$ a local recourse approximation and then averages:

$$\mathfrak Q_t^k(x_{t-1}, \xi_{tj}) = \max_{i \leq k} \left\{ \theta_{tj}^i + \beta_{tj}^{i\top} x_{t-1} \right\}, \quad \hat V_t^k(x_{t-1}) = \sum_{j=1}^{M_t} p_{tj}\, \mathfrak Q_t^k(x_{t-1}, \xi_{tj}).$$

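In both frameworks, each new cut is an affine function that is valid globally: it bounds the function it approximates from below at every point, not only at the trial point where it was generated.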
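The two approximations above can be compared on a toy example. The following is a minimal Python sketch, not taken from the cited paper: the cut storage layout `(intercept, slope)`, the function names, and the numbers are all assumptions made for illustration. It evaluates the single-cut form (maximum over aggregated cuts) and the multicut form (probability-weighted sum of per-scenario maxima) built from the same per-scenario cuts.

```python
from typing import List, Sequence, Tuple

# A cut is stored as (intercept, slope vector); this layout is an assumption.
Cut = Tuple[float, Sequence[float]]


def cut_value(cut: Cut, x: Sequence[float]) -> float:
    """Evaluate one affine cut: intercept + slope . x."""
    intercept, slope = cut
    return intercept + sum(s * xi for s, xi in zip(slope, x))


def single_cut_value(cuts: List[Cut], x: Sequence[float]) -> float:
    """Single-cut approximation: maximum over the aggregated cuts."""
    return max(cut_value(c, x) for c in cuts)


def multicut_value(cuts_by_scenario: List[List[Cut]],
                   probs: Sequence[float],
                   x: Sequence[float]) -> float:
    """Multicut approximation: weighted sum of per-scenario maxima."""
    return sum(p * max(cut_value(c, x) for c in cuts)
               for p, cuts in zip(probs, cuts_by_scenario))


def aggregate(cuts_by_scenario: List[List[Cut]],
              probs: Sequence[float]) -> List[Cut]:
    """Average the i-th cut of every scenario into one expected-value cut."""
    aggregated = []
    for i in range(len(cuts_by_scenario[0])):
        dim = len(cuts_by_scenario[0][i][1])
        intercept = sum(p * cuts[i][0]
                        for p, cuts in zip(probs, cuts_by_scenario))
        slope = [sum(p * cuts[i][1][d]
                     for p, cuts in zip(probs, cuts_by_scenario))
                 for d in range(dim)]
        aggregated.append((intercept, slope))
    return aggregated


# Hypothetical data: two scenarios, two cuts each, one state variable.
probs = [0.5, 0.5]
cuts_by_scenario = [
    [(0.0, [1.0]), (2.0, [-1.0])],   # scenario 1
    [(1.0, [0.0]), (0.0, [2.0])],    # scenario 2
]
x = [0.0]
print(multicut_value(cuts_by_scenario, probs, x))               # 1.5
print(single_cut_value(aggregate(cuts_by_scenario, probs), x))  # 1.0
```

With identical cut information the multicut value is never below the single-cut value, because a sum of maxima dominates the maximum of the sums. In an actual run the two variants would visit different trial points, so this comparison only illustrates the two formulas.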

The forward pass samples a trajectory through the scenario tree, solving a deterministic chain of LPs using the current cut approximations. The backward pass, for each trajectory state and scenario, solves the dual LP to generate a new cut at the visited point (Guigues et al., 2019).
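As a worked illustration of how a backward-pass dual solution yields a single cut, the standard LP sensitivity argument can be written out as follows. The symbols $\underline{Q}_t$ and $\lambda_{tj}^k$ are assumed notation, not taken from the source, and the sign of $\lambda$ follows the convention that it measures the sensitivity of the optimal value to the right-hand side; solvers that report duals with the opposite sign require flipping it.

```latex
% Assumed notation: \underline{Q}_t is the stage-t problem with V_{t+1}
% replaced by the current cut approximation \hat V_{t+1}; \lambda_{tj}^k is
% an optimal dual multiplier of the equality constraint for scenario j,
% solved at the trial point x_{t-1}^k.
\underline{Q}_t(x_{t-1}, \xi_{tj})
  = \min_{x_t \geq 0} \left\{ c_t(\xi_{tj})^\top x_t + \hat V_{t+1}(x_t)
    \;\middle|\; A_t(\xi_{tj}) x_t = b_t(\xi_{tj}) - B_t(\xi_{tj}) x_{t-1} \right\}

% x_{t-1} enters only through the right-hand side, so a subgradient of
% \underline{Q}_t(\cdot, \xi_{tj}) at x_{t-1}^k is -B_t(\xi_{tj})^\top \lambda_{tj}^k.
% Taking expectations over the M_t scenarios gives the cut coefficients:
\beta_t^k = -\sum_{j=1}^{M_t} p_{tj}\, B_t(\xi_{tj})^\top \lambda_{tj}^k,
\qquad
\alpha_t^k = \sum_{j=1}^{M_t} p_{tj}\, \underline{Q}_t(x_{t-1}^k, \xi_{tj})
             - \beta_t^{k\top} x_{t-1}^k

% Validity: since \hat V_{t+1} \leq V_{t+1} and \underline{Q}_t is convex in x_{t-1},
V_t(x_{t-1}) \;\geq\; \sum_{j=1}^{M_t} p_{tj}\, \underline{Q}_t(x_{t-1}, \xi_{tj})
  \;\geq\; \alpha_t^k + \beta_t^{k\top} x_{t-1}
  \quad \text{for all } x_{t-1}.
```

In the multicut variant the same quantities would be kept per scenario instead of being averaged, giving one pair $(\theta_{tj}^k, \beta_{tj}^k)$ for each $j$.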

3. Cut Management: Selection Strategies

The number of cuts grows linearly with the number of iterations, so cut selection (pruning) becomes crucial for computational efficiency. Guigues et al. (2019) introduce a selector framework:

  • Level 1: Keep every cut that has ever been active (maximal) at any trial point. Guarantees the same lower-bound evolution as unpruned SDDP, but may accumulate many cuts.
  • Limited Memory Level 1 (LML 1): For each trial point, retain only the oldest cut among all those tied for maximality. This is far more aggressive, drastically reducing the number of active cuts.

At each backward pass, active cuts are recomputed for all historical trial points and then updated according to the selection rule. This approach applies both to single-cut SDDP and its multicut analogs.
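The two selection rules described above can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the implementation from the cited paper: cuts are stored as `(intercept, slope)` pairs in order of creation (so a smaller index means an older cut), ties are detected with a hypothetical tolerance `tol`, and the function names are invented for this example.

```python
from typing import List, Sequence, Tuple

# Cuts are (intercept, slope vector), listed oldest first (assumed layout).
Cut = Tuple[float, Sequence[float]]


def cut_value(cut: Cut, x: Sequence[float]) -> float:
    """Evaluate one affine cut: intercept + slope . x."""
    intercept, slope = cut
    return intercept + sum(s * xi for s, xi in zip(slope, x))


def active_indices(cuts: List[Cut], x: Sequence[float],
                   tol: float = 1e-9) -> List[int]:
    """Indices of all cuts attaining the maximum at x, up to tol."""
    values = [cut_value(c, x) for c in cuts]
    best = max(values)
    return [i for i, v in enumerate(values) if v >= best - tol]


def level1_select(cuts: List[Cut], trial_points: List[Sequence[float]],
                  tol: float = 1e-9) -> List[int]:
    """Level 1: keep every cut that is maximal at some trial point."""
    keep = set()
    for x in trial_points:
        keep.update(active_indices(cuts, x, tol))
    return sorted(keep)


def lml1_select(cuts: List[Cut], trial_points: List[Sequence[float]],
                tol: float = 1e-9) -> List[int]:
    """LML 1: per trial point, keep only the oldest of the tied maximal cuts."""
    keep = set()
    for x in trial_points:
        keep.add(active_indices(cuts, x, tol)[0])  # smallest index = oldest
    return sorted(keep)


# Hypothetical data: cut 2 duplicates cut 1, cut 3 is dominated everywhere.
cuts = [(0.0, [0.0]), (0.0, [1.0]), (0.0, [1.0]), (-5.0, [1.0])]
trial_points = [[-1.0], [1.0], [0.0]]
print(level1_select(cuts, trial_points))  # [0, 1, 2]
print(lml1_select(cuts, trial_points))    # [0, 1]
```

In this toy pool both rules drop the dominated cut, while only LML 1 also drops the duplicate, which shows why it tends to keep far fewer cuts. Recomputing activity over all historical trial points, as done here, costs time proportional to the number of cuts times the number of trial points per call.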

4. Convergence Analysis

Under the standing assumptions (stagewise independence, finite support, nonempty/bounded recourse, independent path sampling, and exact LP solutions), SDDP with any selector satisfying a monotonicity property (e.g., Level 1 or LML 1) exhibits almost-sure finite convergence:

  • There exists $k_0 < \infty$ such that for all $k \geq k_0$, both the pool of locally active cuts