Chambolle–Pock Primal–Dual Algorithm

Updated 18 March 2026

The Chambolle–Pock Primal–Dual Algorithm is a first-order operator-splitting method for structured convex optimization that uses proximal updates and a saddle-point formulation.
It employs non-intrusive primal–dual variable updates with prox-operators, offering rigorous convergence guarantees including O(1/N) ergodic and linear rates under proper parameter rules.
Its flexible design extends to multi-block, stochastic, nonconvex, and preconditioned variants, making it suitable for large-scale imaging, tomography, and distributed optimization.

The Chambolle–Pock Primal–Dual Algorithm, also known as the Primal–Dual Hybrid Gradient (PDHG) method, is a first-order operator-splitting scheme for structured convex optimization and monotone inclusion problems of the form

$\min_{x\in X} g(x) + h(Ax),$

where $g: X \to (-\infty,+\infty]$ and $h: Y \to (-\infty,+\infty]$ are proper, lower semicontinuous convex functions, and $A : X \to Y$ is a bounded linear operator between real Hilbert spaces. The algorithm has an equivalent saddle-point (primal–dual) formulation and admits generalizations to multi-block, stochastic, and nonconvex regimes, unifying and extending several classic optimization methods. It is notable for primal–dual updates that access $g$ and $h$ only through their proximal operators, for sharp step-size conditions, and for its extensibility to preconditioned, accelerated, and block-coordinate variants. Viewing the iteration as a fixed-point operator, together with the associated monotonicity analysis, permits tight convergence theorems and parameter regions.

1. Variational Formulation and Operator Framework

The canonical variational problem is

$\min_{x\in X} g(x) + h(Ax),$

with Fenchel–Rockafellar dual

$\max_{y\in Y} -g^*(-A^T y) - h^*(y).$

Introducing the saddle-point Lagrangian yields

$\min_{x\in X} \max_{y\in Y} g(x) + \langle A x, y\rangle - h^*(y).$

The KKT conditions for optimality become

$0 \in \partial g(x^*) + A^T y^*,\quad 0 \in \partial h^*(y^*) - A x^*.$

The problem can be equivalently written as a monotone inclusion

$0 \in \begin{pmatrix} \partial g & A^T \\ -A & \partial h^* \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix}.$

This structure enables the interpretation of the method as a (possibly preconditioned) nonexpansive operator splitting on $X\times Y$, and as a special case of more general primal–dual or three-operator methods (Yan, 2016).
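As a short worked step, the saddle-point and dual forms of this section follow from the biconjugate identity $h = h^{**}$, which holds because $h$ is proper, lower semicontinuous, and convex. The lines below restate only standard definitions; exchanging the minimum and supremum in the last step is assumed to be justified by a standard constraint qualification.

```latex
\begin{aligned}
h(Ax) &= \sup_{y\in Y}\,\bigl\{\langle Ax, y\rangle - h^*(y)\bigr\}
  && \text{(Fenchel--Moreau: } h = h^{**}\text{)} \\
\min_{x\in X}\; g(x) + h(Ax)
  &= \min_{x\in X}\,\sup_{y\in Y}\; g(x) + \langle Ax, y\rangle - h^*(y) \\
\inf_{x\in X}\,\bigl\{ g(x) + \langle x, A^T y\rangle \bigr\}
  &= -g^*(-A^T y)
  && \text{(definition of } g^*\text{)} \\
\sup_{y\in Y}\,\inf_{x\in X}\; g(x) + \langle Ax, y\rangle - h^*(y)
  &= \sup_{y\in Y}\; -g^*(-A^T y) - h^*(y)
\end{aligned}
```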

2. Chambolle–Pock Iteration and Parameter Rules

The Chambolle–Pock method generates iterates, for $k \ge 0$ and step sizes $\tau, \sigma > 0$,

$x^{k+1} = \mathrm{prox}_{\tau g}(x^k - \tau A^T y^k),\quad \bar{x}^{k+1} = 2x^{k+1} - x^k,\quad y^{k+1} = \mathrm{prox}_{\sigma h^*}(y^k + \sigma A \bar{x}^{k+1}).$

A variant uses over-relaxation $\theta \in [0,1]$: $\bar{x}^{k+1} = x^{k+1} + \theta(x^{k+1} - x^k).$ The choice $\theta = 1$ yields maximal acceleration in the convex case. The explicit parameter condition for convergence is

$\tau\sigma\|A\|^2 < 1,$

a sharp and efficiently verifiable bound that also guarantees firm nonexpansiveness (1/2-averaged map) in the composite Hilbert product norm (Yan, 2016, Sidky et al., 2011, Sidky et al., 16 Mar 2026). Improved Lyapunov analyses extend the admissible region to $\tau\sigma\|A\|^2 < 4/3$ for certain splitting variants and step-size couplings (Li et al., 2022, He et al., 2021, Chang et al., 1 Oct 2025).

Each iteration costs one application each of $A$ and $A^T$ plus two proximal mappings, and the scheme applies directly to nonsmooth or indicator terms.
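To make the iteration of this section concrete, here is a minimal NumPy sketch for one assumed instance, $g = \lambda\|\cdot\|_1$ and $h = \tfrac12\|\cdot - b\|^2$, with primal step `tau`, dual step `sigma`, and extrapolation parameter `theta`. The function names and the `safety` factor are illustrative choices, not part of any cited implementation.

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal map of t * ||.||_1 (componentwise shrinkage)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def chambolle_pock_lasso(A, b, lam, n_iter=500, theta=1.0, safety=0.99):
    """Sketch of the iteration for min_x lam*||x||_1 + 0.5*||A x - b||^2.

    Here g = lam*||.||_1 and h = 0.5*||. - b||^2, so that
    prox_{sigma h*}(v) = (v - sigma*b) / (1 + sigma).
    """
    L = np.linalg.norm(A, 2)              # spectral norm ||A||
    tau = sigma = safety / L              # tau*sigma*||A||^2 = safety^2 < 1
    x = np.zeros(A.shape[1])
    y = np.zeros(A.shape[0])
    for _ in range(n_iter):
        # primal proximal step
        x_new = soft_threshold(x - tau * (A.T @ y), tau * lam)
        # extrapolation of the primal variable
        x_bar = x_new + theta * (x_new - x)
        # dual proximal step
        y = (y + sigma * (A @ x_bar) - sigma * b) / (1.0 + sigma)
        x = x_new
    return x, y

# Usage: with A = I the minimizer is the soft-thresholded data.
b = np.array([3.0, 0.5, -2.0])
x_hat, y_hat = chambolle_pock_lasso(np.eye(3), b, lam=1.0)
```

With $A = I$ the problem separates by coordinate, so the result can be checked against `soft_threshold(b, lam)`; for a general $A$ the same loop applies, but the number of iterations needed is problem dependent.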

3. Convergence and Rate Guarantees

The deterministic Chambolle–Pock method enjoys the following guarantees:

Weak convergence: Iterates converge weakly to a saddle point under the standard step-size condition.
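Ergodic rate: The ergodic (Cesàro) primal–dual gap at the averaged iterates $(\bar{x}^N, \bar{y}^N) = \frac{1}{N}\sum_{k=1}^{N}(x^k, y^k)$ decays as $O(1/N)$ in the merely convex case:

$\mathcal{L}(\bar{x}^N, y) - \mathcal{L}(x, \bar{y}^N) = O(1/N)$ for every $(x, y)$ in a bounded subset of $X \times Y$, where $\mathcal{L}(x,y) = g(x) + \langle Ax, y\rangle - h^*(y)$.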

(Yan, 2016, Sidky et al., 2011, Zhu et al., 2022)

Linear convergence: Under strong convexity of both $g$ and $h^*$ (equivalently, when $g^*$ and $h$ have Lipschitz-continuous gradients), the fixed-point map becomes contractive and the iteration converges at rate $O(\rho^k)$ for some $\rho \in (0,1)$ (Yan, 2016, Clason et al., 2018).
Non-convex and semiconvex extensions: If $h$ is $\omega$-semiconvex but $g$ is $c$-strongly convex with $c$ sufficiently large relative to $\omega$, convergence with a nonergodic $O(1/N)$ rate holds for suitable steps (Möllenhoff et al., 2014).

For block-coordinate, stochastic, and preconditioned variants, analogous $O(1/N)$ rates apply under matching blockwise or expected-separable-overapproximation parameter rules, improving to $O(1/N^2)$ or linear rates on (semi-)strongly convex blocks (Chambolle et al., 2017, Valkonen, 2016, Bilenne, 2024).
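The ergodic gap discussed in this section can be observed on a toy saddle problem. The sketch below assumes $g$ and $h^*$ are both the indicator of $[-1,1]$ and $A = 1$, so that the problem is $\min_{|x|\le 1}\max_{|y|\le 1} xy$ with saddle point $(0,0)$; the function name and the starting point $(1,1)$ are illustrative.

```python
def clip_unit(v):
    """Projection onto [-1, 1], the prox of the indicator of that interval."""
    return max(-1.0, min(1.0, v))

def ergodic_gap_bilinear(n_iter, tau=0.5, sigma=0.5):
    """Run the primal-dual iteration and return the gap at the averaged iterates."""
    x, y = 1.0, 1.0
    x_sum, y_sum = 0.0, 0.0
    for _ in range(n_iter):
        x_new = clip_unit(x - tau * y)        # primal proximal step
        x_bar = 2.0 * x_new - x               # extrapolation with theta = 1
        y = clip_unit(y + sigma * x_bar)      # dual proximal step
        x = x_new
        x_sum += x
        y_sum += y
    x_avg, y_avg = x_sum / n_iter, y_sum / n_iter
    # Gap over the box: max_y x_avg*y - min_x x*y_avg = |x_avg| + |y_avg|.
    return abs(x_avg) + abs(y_avg)

for n in (10, 100, 1000):
    print(n, ergodic_gap_bilinear(n))
```

On this particular problem the last iterate also converges, so the averaged gap is only an illustration of how the quantity in the ergodic bound is formed and measured.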

4. Generalizations, Preconditioning, and Extensions

Stochastic and Block-Coordinate Variants

Randomized dual updates: Only a random subset of dual (or primal) blocks is updated per iteration, with step sizes controlled by expected separable overapproximation (ESO). These schemes retain the $O(1/N)$ ergodic rate while offering large practical gains in large-scale settings (Chambolle et al., 2017, Luke et al., 2018).
Spatially variable acceleration: Blockwise strong convexity and adaptively chosen step sizes can give locally accelerated convergence (e.g., $O(1/N^2)$ for strongly convex blocks and $O(1/N)$ otherwise), with fully parallel updates and doubly-stochastic policies (Valkonen, 2016).
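A minimal sketch of a randomized dual-block update, assuming a least-squares objective $\sum_i \tfrac12\|A_i x - b_i\|^2$ with $g = 0$ and uniform sampling of one block per iteration. The step-size rule $\tau\sigma_i\|A_i\|^2 < p_i$ used here is an assumed instance of the serial-sampling condition, and all names are illustrative rather than taken from the cited papers.

```python
import numpy as np

def spdhg_least_squares(A_blocks, b_blocks, n_iter=2000, gamma=0.95, seed=0):
    """Sketch of stochastic PDHG for min_x sum_i 0.5*||A_i x - b_i||^2 (g = 0)."""
    rng = np.random.default_rng(seed)
    n = len(A_blocks)
    p = 1.0 / n                                   # uniform sampling probability
    norms = [np.linalg.norm(Ai, 2) for Ai in A_blocks]
    sigma = [gamma / Li for Li in norms]          # per-block dual steps
    tau = gamma * p / max(norms)                  # gives tau*sigma_i*||A_i||^2 < p
    x = np.zeros(A_blocks[0].shape[1])
    y = [np.zeros(Ai.shape[0]) for Ai in A_blocks]
    z = np.zeros_like(x)        # z = sum_i A_i^T y_i, maintained incrementally
    z_bar = np.zeros_like(x)    # extrapolated version of z
    for _ in range(n_iter):
        x = x - tau * z_bar                       # prox of g = 0 is the identity
        i = int(rng.integers(n))                  # sample one dual block
        v = y[i] + sigma[i] * (A_blocks[i] @ x)
        y_new = (v - sigma[i] * b_blocks[i]) / (1.0 + sigma[i])
        dz = A_blocks[i].T @ (y_new - y[i])
        y[i] = y_new
        z = z + dz
        z_bar = z + dz / p                        # extrapolation with theta = 1
    return x, y, z
```

With a single block (`p = 1`) the loop reduces to a deterministic primal–dual iteration with extrapolation on the dual variable, which is a convenient sanity check before moving to several blocks.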

Larger Step Sizes and Convex Combinations

Recent analyses have enlarged the admissible step-size region using convex combination steps and generalized nonexpansive operator theory:

Convex-combination extrapolation and parameter extension: A new class of schemes built on a convex-combination step admits a larger bound on the step-size product than the classical one, realizing significant empirical speedups (Chang et al., 1 Oct 2025).
Generalized splitting and three-term reductions: The Chambolle–Pock iteration is a special case of a three-function primal–dual splitting (PD3O), recovered when the additional smooth term is set to zero; such reductions cement its role as the canonical method for problems with bilinear saddle structure (Yan, 2016).

Preconditioning

Diagonal and non-diagonal preconditioning can be integrated to cope with ill-conditioning, especially in imaging settings:

Diagonal preconditioning: The scalar step sizes are replaced by diagonal matrices $T$ and $\Sigma$ whose entries are inverses of aggregated absolute column and row sums of $A$, which improves convergence when primal and dual terms are unbalanced (Sidky et al., 16 Mar 2026).
Non-diagonal preconditioning: Spectral approximations of the operator are used to accelerate convergence, which is especially effective in tomography (Sidky et al., 16 Mar 2026).
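One common way to build diagonal step-size matrices is the rule usually attributed to Pock and Chambolle, $\tau_j = 1/\sum_i |A_{ij}|^{2-\alpha}$ and $\sigma_i = 1/\sum_j |A_{ij}|^{\alpha}$. The sketch below assumes a dense NumPy matrix and $0 < \alpha < 2$; the function name and the `eps` guard are illustrative.

```python
import numpy as np

def diagonal_preconditioners(A, alpha=1.0, eps=1e-12):
    """Diagonal steps tau_j = 1/sum_i |A_ij|^(2-alpha), sigma_i = 1/sum_j |A_ij|^alpha."""
    absA = np.abs(A)
    col_sums = np.sum(absA ** (2.0 - alpha), axis=0)   # one entry per primal variable
    row_sums = np.sum(absA ** alpha, axis=1)           # one entry per dual variable
    tau = 1.0 / np.maximum(col_sums, eps)              # eps guards empty columns
    sigma = 1.0 / np.maximum(row_sums, eps)            # eps guards empty rows
    return tau, sigma

# Usage: the scaled operator Sigma^{1/2} A T^{1/2} has spectral norm at most 1,
# which plays the role of the scalar step-size condition.
rng = np.random.default_rng(0)
A = rng.standard_normal((8, 5))
tau, sigma = diagonal_preconditioners(A)
scaled = np.sqrt(sigma)[:, None] * A * np.sqrt(tau)[None, :]
scaled_norm = np.linalg.norm(scaled, 2)
```

In the iteration, `tau` and `sigma` then multiply the primal and dual steps entrywise, and the proximal maps must be evaluated in the corresponding weighted norms, which is straightforward when $g$ and $h^*$ are separable.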

Nonconvex, Nonmonotone, and Inexact Operators

Extensions exist to:

Semiconvex/nonconvex regularization: If $h$ is only semiconvex and $g$ is sufficiently strongly convex, splitting with careful step sizes and parameter choices achieves $O(1/N)$ pointwise convergence (Möllenhoff et al., 2014).
Nonmonotone, semimonotone inclusions: By leveraging (oblique) weak-Minty and semimonotonicity conditions on the composite operator, step-size and relaxation parameters can exceed classical bounds; exact dependence on singular values can further characterize admissible regions (Evens et al., 2023).
Mismatched adjoints: When the transpose $A^T$ is replaced with an approximation $\tilde{A}^T \approx A^T$ (e.g., in practical CT), linear convergence and error bounds still hold if the linear operator mismatch is small and step sizes are properly scaled (Lorenz et al., 2022).
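Whether a backprojector is a matched adjoint can be probed with the standard dot-product test, which compares $\langle Ax, y\rangle$ with $\langle x, By\rangle$ on random vectors. The sketch below is a generic diagnostic, not a procedure from the cited work; `forward` and `backward` are hypothetical callables for $A$ and the candidate adjoint $B$.

```python
import numpy as np

def adjoint_mismatch(forward, backward, n_primal, n_dual, n_trials=10, seed=0):
    """Largest relative gap between <A x, y> and <x, B y> over random probes.

    A value near machine precision indicates that B matches A^T on the probes.
    """
    rng = np.random.default_rng(seed)
    worst = 0.0
    for _ in range(n_trials):
        x = rng.standard_normal(n_primal)
        y = rng.standard_normal(n_dual)
        Ax = forward(x)
        lhs = np.dot(Ax, y)
        rhs = np.dot(x, backward(y))
        scale = np.linalg.norm(Ax) * np.linalg.norm(y) + 1e-300
        worst = max(worst, abs(lhs - rhs) / scale)
    return worst

# Usage: a matched pair versus a backprojector that is off by a 10% scaling.
rng = np.random.default_rng(1)
A = rng.standard_normal((5, 4))
matched = adjoint_mismatch(lambda x: A @ x, lambda y: A.T @ y, 4, 5)
mismatched = adjoint_mismatch(lambda x: A @ x, lambda y: 1.1 * (A.T @ y), 4, 5)
```

A small but nonzero value suggests the mismatched-adjoint regime described above, where step sizes may need to be scaled more conservatively.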

5. Connections to Other Optimization Schemes

Proximal Point and ADMM Equivalences

The Chambolle–Pock algorithm is a special instance of the weighted proximal point method (PPM) for mixed variational inequalities, reflected in its update using a positive definite metric induced by the parameters $\tau$ and $\sigma$ (Chan et al., 2014). Moreover, its iteration sequence coincides with that of linearized ADMM (LADM) for the primal or dual, up to initialization and block-cycling, and its operator splitting viewpoint unifies ADMM, Douglas–Rachford, and augmented Lagrangian approaches.
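The proximal-point metric is commonly written as the block matrix $M = \begin{pmatrix} I/\tau & -A^T \\ -A & I/\sigma \end{pmatrix}$ for the $\theta = 1$ iteration; this explicit form is an assumption here, since the text does not spell it out. A Schur-complement argument shows $M$ is positive definite exactly when $\tau\sigma\|A\|^2 < 1$, which the sketch below checks numerically on a small matrix.

```python
import numpy as np

def pdhg_metric(A, tau, sigma):
    """Block matrix [[I/tau, -A^T], [-A, I/sigma]] on the product space X x Y."""
    m, n = A.shape
    return np.block([[np.eye(n) / tau, -A.T],
                     [-A, np.eye(m) / sigma]])

def min_eigenvalue(A, product):
    """Smallest eigenvalue of the metric when tau*sigma*||A||^2 equals `product`."""
    L = np.linalg.norm(A, 2)
    tau = sigma = np.sqrt(product) / L
    return float(np.linalg.eigvalsh(pdhg_metric(A, tau, sigma)).min())

A = np.array([[2.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
inside = min_eigenvalue(A, 0.9)    # step sizes inside the admissible region
outside = min_eigenvalue(A, 1.1)   # step sizes violating the condition
```

With equal step sizes the smallest eigenvalue is $\|A\|(1/s - 1)$ for $s = \sqrt{\tau\sigma}\,\|A\|$, so its sign changes exactly at the boundary of the step-size condition.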

Reduction to Primal-Only Schemes

On linearly constrained problems, the primal–dual iterations can be written as entirely primal algorithms, yielding Tseng-type accelerated penalties and allowing efficient distributed implementations with one communication round per iteration (Malitsky, 2017, Bilenne, 2024).

Augmented-Lagrangian and Unified Frameworks

Within the more general framework of augmented-Lagrangian methods and conic programming, the Chambolle–Pock method can be viewed as a limiting case with no explicit penalty parameter; with penalties, the method further extends to a broader family (GDA, OGDA, SOGDA) while maintaining $O(1/N)$ ergodic rates and improved infeasibility decay (Zhu et al., 2022).

6. Practical Implementation and Application Domains

Imaging and Tomography

The Chambolle–Pock method is a "plug-and-play" scheme for prototyping large-scale imaging and inverse problems. It is especially suited for:

Total Variation (TV) regularization for denoising, deblurring, and compressed sensing (Sidky et al., 2011, Sidky et al., 16 Mar 2026);
CT and PET reconstruction involving various data-fidelity and constraint terms, where all required prox-mappings admit closed forms (see Table 1).

Data/Regularizer Term	Function ($g$ or $h$)	Proximal Map / Update
…	…	…
…	…	…
TV $\lambda\|\nabla x\|_1$	$h(z) = \lambda\|z\|_1$, $z = \nabla x$	Shrinkage: $\mathrm{prox}_{\sigma h}(z)_i = \mathrm{sign}(z_i)\max(|z_i| - \sigma\lambda, 0)$
Nonnegativity	$g(x) = \iota_{\{x \ge 0\}}(x)$	Projection: $\mathrm{prox}_{\tau g}(x) = \max(x, 0)$

All operations (matrix-vector, gradient, prox) are data-parallel and suited for GPU implementation, with minimal storage requirements and no need for linesearch or parameter tuning beyond initial spectral estimation (Sidky et al., 2011, Sidky et al., 16 Mar 2026).
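The closed-form proximal maps referred to above are short to write down. The sketch below gives a few standard ones in NumPy, including the Moreau identity that converts a prox of $h$ into the prox of $h^*$ needed by the dual step; the function names are illustrative, and the least-squares and $\ell_1$ terms are assumed examples rather than a transcription of the table.

```python
import numpy as np

def prox_l1(v, t):
    """prox of t*||.||_1: soft-thresholding (shrinkage)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def prox_nonneg(v):
    """prox of the indicator of {x >= 0}: projection onto the nonnegative orthant."""
    return np.maximum(v, 0.0)

def prox_conj_l1(v, sigma, lam):
    """prox of sigma*h^* for h = lam*||.||_1, via the Moreau identity
    prox_{sigma h^*}(v) = v - sigma * prox_{h/sigma}(v / sigma)."""
    return v - sigma * prox_l1(v / sigma, lam / sigma)

def prox_conj_l2sq(v, sigma, b):
    """prox of sigma*h^* for h = 0.5*||. - b||^2 (least-squares data term)."""
    return (v - sigma * b) / (1.0 + sigma)

# Usage: for the l1 term, the dual prox is a clip onto [-lam, lam].
v = np.array([-3.0, -0.4, 0.0, 1.2, 5.0])
dual_step = prox_conj_l1(v, sigma=0.7, lam=1.5)
```

Each map is elementwise, which is what makes the per-iteration work data-parallel.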

Large-Scale, Block, and Distributed Settings

Stochastic and coordinate updates: Efficient use of computation in large-scale and distributed environments, with proven acceleration properties when blockwise strong convexity applies (Chambolle et al., 2017, Valkonen, 2016, Bilenne, 2024).
Distributed consensus optimization: Reduction to one round of communication per iteration in graph-based problems; tight feasibility and objective decay (Malitsky, 2017, Bilenne, 2024).

Nonconvex, Nonmonotone, and Inexact Regimes

Nonconvex PDE-constrained optimization: The method extends to a nonlinear operator in place of $A$ via a testing framework and three-point growth conditions, which yield accelerated or linear convergence in semi-strongly convex subsets (Clason et al., 2018).
Block-coordinate, spatially adapted, or inexact prox: Theoretical convergence maintained with minor modifications, and substantial empirical gains possible in high-dimensional or ill-conditioned regimes (Valkonen, 2016, Bilenne, 2024).

7. Rate Optimality, Parameter Tightness, and Frontier Developments

Tightness of step-size bounds: The strictness of the classical and improved bounds on $\tau\sigma\|A\|^2$ can be illustrated by failure modes (cycling) at equality; recent extensions justify and sharpen these limits via spectral and Lyapunov analyses (Li et al., 2022, Chang et al., 1 Oct 2025).
Parameter heuristics: Practical guidelines suggest estimating $\|A\|$ by the power method, setting $\tau$ and $\sigma$ so that $\tau\sigma\|A\|^2$ sits at or just below 1 (e.g., $\tau = \sigma \approx 1/\|A\|$), and choosing $\theta = 1$ unless strong convexity warrants adaptive schemes (Sidky et al., 16 Mar 2026).
Frontier directions: Current research investigates the algorithm's robustness under nonmonotonicity and semimonotonicity (Evens et al., 2023), convergence with mismatched operators (Lorenz et al., 2022), and its unification within augmented-Lagrangian or momentum-accelerated first-order frameworks (Zhu et al., 2022, Hamedani et al., 2018).
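The power-method heuristic mentioned above needs only products with $A$ and $A^T$. The sketch below is a minimal version with hypothetical `forward` and `backward` callables; because the power iteration approaches $\|A\|$ from below, the `safety` factor (an assumed margin, not a value from the cited notes) keeps the resulting step sizes inside the admissible region.

```python
import numpy as np

def estimate_operator_norm(forward, backward, n_primal, n_iter=100, seed=0):
    """Power iteration on A^T A using only products with A and A^T."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(n_primal)
    x /= np.linalg.norm(x)
    norm_est = 0.0
    for _ in range(n_iter):
        w = backward(forward(x))          # w = A^T A x
        w_norm = np.linalg.norm(w)
        if w_norm == 0.0:                 # x lies in the null space of A
            break
        norm_est = np.sqrt(w_norm)        # ||A^T A x|| tends to ||A||^2 for unit x
        x = w / w_norm
    return norm_est

def step_sizes(norm_est, safety=0.95):
    """Equal steps with tau*sigma*norm_est^2 = safety^2 < 1."""
    tau = sigma = safety / norm_est
    return tau, sigma

# Usage on a small matrix whose spectral norm is known to be 3.
A = np.diag([3.0, 2.0, 1.0])
norm_est = estimate_operator_norm(lambda x: A @ x, lambda y: A.T @ y, 3)
tau, sigma = step_sizes(norm_est)
```

For operators with closely spaced leading singular values the estimate converges slowly, so more iterations or a smaller `safety` value may be appropriate.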

In summary, the Chambolle–Pock Primal–Dual Algorithm occupies a central position in first-order convex optimization due to its modularity, rigorously characterized convergence, broad applicability, and flexibility for both theoretical generalization and practical, large-scale deployment. Its operator-theoretic perspective continues to inform developments in monotone splitting, composite inclusion problems, and structured nonconvex optimization.


References (18)

1. A new primal-dual algorithm for minimizing the sum of three functions with a linear operator (2016)
2. Convex optimization problem prototyping for image reconstruction in computed tomography with the Chambolle-Pock algorithm (2011)
3. Notes on the primal-dual algorithm for convex optimization applied to X-ray tomographic image reconstruction (2026)
4. On the improved conditions for some primal-dual algorithms (2022)
5. A generalized primal-dual algorithm with improved convergence condition for saddle point problems (2021)
6. A primal-dual splitting algorithm with convex combination and larger step sizes for composite monotone inclusion problems (2025)
7. A Unified Primal-Dual Algorithm Framework for Inequality Constrained Problems (2022)
8. Acceleration and global convergence of a first-order primal-dual method for nonconvex problems (2018)
9. The Primal-Dual Hybrid Gradient Method for Semiconvex Splittings (2014)
10. Stochastic Primal-Dual Hybrid Gradient Algorithm with Arbitrary Sampling and Imaging Applications (2017)
11. Block-proximal methods with spatially adapted acceleration (2016)
12. Parametrization and convergence of a primal-dual block-coordinate approach to linearly-constrained nonsmooth optimization (2024)
13. Block-coordinate primal-dual method for the nonsmooth minimization over linear constraints (2018)
14. Convergence of the Chambolle-Pock Algorithm in the Absence of Monotonicity (2023)
15. Chambolle-Pock's Primal-Dual Method with Mismatched Adjoint (2022)
16. Inertial primal-dual algorithms for structured convex optimization (2014)
17. The primal-dual hybrid gradient method reduces to a primal method for linearly constrained optimization problems (2017)
18. A Primal-Dual Algorithm with Line Search for General Convex-Concave Saddle Point Problems (2018)
