
Brenier’s Theorem: Foundations in Optimal Transport

Updated 27 February 2026
  • Brenier’s Theorem is a fundamental result in optimal transport that ensures the existence and uniqueness of an optimal map, represented as the gradient of a convex function.
  • The theorem leverages duality, c-concavity, and the Monge–Ampère equation to establish optimality and smooth regularity under quadratic cost conditions.
  • Extensions to degenerate and barycentric costs provide insights into numerical algorithms, optimal coupling, and the transition to the Knothe–Rosenblatt rearrangement.

Brenier’s Theorem is the foundational existence and uniqueness result for the Monge–Kantorovich optimal transport problem with quadratic cost. Given Borel probability measures μ and ν on ℝⁿ (or on the flat torus 𝕋ⁿ = ℝⁿ/ℤⁿ), with μ absolutely continuous with respect to Lebesgue measure and both measures having finite second moments, there exists a unique optimal transport map pushing μ to ν, that is, a map minimizing the expected quadratic cost. This map is (almost everywhere) the gradient of a convex function, and when both measures have densities that function solves the Monge–Ampère equation. The theorem not only describes the structure of optimal plans but also ties optimal transport to convex analysis, PDEs, and the geometry of mass transport.

1. Statement and Mathematical Formulation

Let μ, ν ∈ 𝒫₂(ℝⁿ) be Borel probability measures with finite second moments, and suppose μ ≪ dx. The Monge problem seeks a measurable transport map T: ℝⁿ → ℝⁿ pushing μ onto ν (T♯μ = ν) that minimizes

$$\int_{\mathbb{R}^n} \frac{1}{2} |x - T(x)|^2\,d\mu(x).$$

Brenier’s theorem asserts that there is an optimal map of the form T = ∇u, where u: ℝⁿ → ℝ is convex, and that this map is unique up to μ-null sets. Moreover, every optimal plan is concentrated on the graph of T (Bonnotte, 2012; Gozlan et al., 2018).
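For Gaussian measures the Brenier map is known in closed form, which gives a quick sanity check of the statement. The sketch below is an illustration only: the covariances `S_mu` and `S_nu` are made-up values and the helper names are hypothetical. It builds the matrix B of the affine map T(x) = m_ν + B(x − m_μ) and checks that B is symmetric positive definite (so T is the gradient of a convex quadratic) and that T carries the covariance of μ to that of ν.

```python
import numpy as np

def sqrtm_psd(M):
    """Symmetric square root of a symmetric positive semidefinite matrix."""
    w, V = np.linalg.eigh(M)
    return (V * np.sqrt(np.clip(w, 0.0, None))) @ V.T

def gaussian_brenier_matrix(S_mu, S_nu):
    """Matrix B of the map T(x) = m_nu + B (x - m_mu) between two Gaussians."""
    R = sqrtm_psd(S_mu)
    R_inv = np.linalg.inv(R)
    return R_inv @ sqrtm_psd(R @ S_nu @ R) @ R_inv

# Illustrative covariances (assumed values, not taken from the cited papers).
S_mu = np.array([[2.0, 0.5], [0.5, 1.0]])
S_nu = np.array([[1.0, -0.3], [-0.3, 3.0]])
B = gaussian_brenier_matrix(S_mu, S_nu)

# B symmetric positive definite  =>  T is the gradient of the convex quadratic
# u(x) = <m_nu, x> + 0.5 <B (x - m_mu), x - m_mu>.
is_symmetric = np.allclose(B, B.T)
min_eig = np.linalg.eigvalsh((B + B.T) / 2).min()

# The pushforward of N(m_mu, S_mu) under T has covariance B S_mu B^T.
pushes_forward = np.allclose(B @ S_mu @ B.T, S_nu)
print(is_symmetric, min_eig > 0, pushes_forward)
```

An affine map with a symmetric positive definite linear part is always the gradient of a convex function, so by Brenier’s theorem the pushforward check alone identifies it as the optimal map.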

On the torus 𝕋ⁿ, with cost $c_A(x, y) = \frac{1}{2} \langle A(x-y), x-y \rangle$ for $A \in S_n^{++}$ (the symmetric positive-definite matrices), and under smoothness and positivity conditions on the densities $f = d\mu/dx$, $g = d\nu/dx$, there exists a unique c_A-concave potential ψ: 𝕋ⁿ → ℝ (with ∫ψ = 0) such that T(x) = x − A⁻¹∇ψ(x) pushes μ onto ν. Moreover, ψ ∈ C^∞(𝕋ⁿ) and A − D²ψ(x) > 0 everywhere (Bonnotte, 2012).

2. Duality, c-Concavity, and Cyclic Monotonicity

The existence and characterization of the optimal transport map rest on Kantorovich duality and the notion of c-concavity. For the quadratic cost c(x, y) = ½|x − y|² of Section 1, the dual formulation is

$$\sup_{\psi, \chi}\left\{ \int \psi(x)\,d\mu(x) + \int \chi(y)\,d\nu(y) \;\middle|\; \psi(x) + \chi(y) \leq \tfrac{1}{2}|x-y|^2 \ \ \forall x, y \right\}.$$

There is always an optimizer of the form ψ = φ, χ = φᶜ, with φ c-concave and φᶜ(y) = inf_x [½|x − y|² − φ(x)]. For this cost, c-concavity of φ means that u(x) = ½|x|² − φ(x) is convex. Complementary slackness (φ(x) + φᶜ(y) = ½|x − y|² for π⋆-almost every (x, y)) implies y ∈ ∂u(x). Under absolute continuity of μ, this subdifferential reduces μ-a.e. to a single point, yielding T(x) = ∇u(x) = x − ∇φ(x) (Gozlan et al., 2018).
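Complementary slackness can be checked numerically in one dimension. The sketch below assumes the cost ½|x − y|² of Section 1 and writes u(x) = ½x² − φ(x) for the convex potential. The source N(0, 1), the target N(m, σ²), and the grid are illustrative choices. It computes the c-transform φᶜ(y) = inf_x [½(x − y)² − φ(x)] by brute force on the grid, then evaluates the slack ½(x − y)² − φ(x) − φᶜ(y) on and off the graph of T.

```python
import numpy as np

m, sigma = 1.0, 2.0                         # assumed target N(m, sigma^2); source is N(0, 1)

def T(x):
    return m + sigma * x                    # monotone map, T = u'

def u(x):
    return m * x + 0.5 * sigma * x**2       # convex potential with u' = T

def phi(x):
    return 0.5 * x**2 - u(x)                # Kantorovich potential

grid = np.linspace(-6.0, 6.0, 1201)         # grid step 0.01

def phi_c(y):
    """Brute-force c-transform on the grid: min_x [0.5 (x - y)^2 - phi(x)]."""
    return np.min(0.5 * (grid - y) ** 2 - phi(grid))

def slack(x, y):
    """0.5 |x - y|^2 - phi(x) - phi_c(y); nonnegative, zero on the optimal plan."""
    return 0.5 * (x - y) ** 2 - phi(x) - phi_c(y)

xs = grid[(grid >= -2.0) & (grid <= 2.0)]
on_graph = np.array([slack(x, T(x)) for x in xs])         # pairs (x, T(x))
off_graph = np.array([slack(x, T(x) + 1.0) for x in xs])  # pairs shifted off the graph
print(np.abs(on_graph).max(), off_graph.min())
```

The slack vanishes (up to rounding) exactly on the graph of T and is strictly positive off it. In this quadratic example the off-graph slack at y = T(x) + 1 works out to 1/(2σ).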

Cyclical monotonicity, as established by McCann, ensures that T is optimal. For torus costs, $\psi^{cc} = \psi$ if and only if $x \mapsto \frac{1}{2}Ax \cdot x - \psi(x)$ is convex. If A − D²ψ > 0, T is a diffeomorphism (Bonnotte, 2012).
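In one dimension, cyclical monotonicity for the quadratic cost reduces to monotonicity: the optimal assignment pairs the i-th smallest source point with the i-th smallest target point. A minimal sketch, with randomly drawn illustrative points, compares this sorted matching against a brute-force search over all one-to-one assignments of six points.

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=6)                        # illustrative source points
y = rng.normal(loc=1.0, scale=2.0, size=6)    # illustrative target points

def cost(perm):
    """Average quadratic cost 0.5 |x_i - y_perm(i)|^2 of a one-to-one assignment."""
    return 0.5 * np.mean((x - y[list(perm)]) ** 2)

# Exhaustive search over all 6! = 720 assignments.
brute_force = min(cost(p) for p in itertools.permutations(range(6)))

# Monotone matching: i-th smallest x is sent to i-th smallest y.
monotone = 0.5 * np.mean((np.sort(x) - np.sort(y)) ** 2)

print(brute_force, monotone)
```

The two costs agree up to rounding, so no assignment beats the monotone one.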

3. Monge–Ampère Equation and Regularity Theory

The optimal transport map must satisfy the Monge–Ampère equation. In the standard case (A = I), this reads

$$\det D^2 u(x) = \frac{f(x)}{g(\nabla u(x))},$$

with T(x) = ∇u(x). For general A, the formulation in terms of ψ is

$$f(x) = g(x - A^{-1} \nabla \psi(x))\,\det [I - A^{-1} D^2 \psi(x)].$$

The regularity theory asserts that if the densities f and g are smooth and strictly positive, then the Kantorovich potential ψ and the transport map T are smooth and T is a diffeomorphism (Bonnotte, 2012).
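The general-A formulation can be tested on the one-dimensional torus, where A reduces to a scalar a > 0. The sketch below is a hypothetical example: it assumes the potential ψ(x) = ε cos(2πx)/(2π)² with ε < a, so that a − ψ'' > 0, and a uniform target density g ≡ 1. It defines f through the Monge–Ampère equation, then checks that T(x) = x − ψ'(x)/a pushes f dx onto the uniform measure by testing that every Fourier mode of the pushforward vanishes.

```python
import numpy as np

a, eps = 2.0, 0.8                  # assumed scalar cost weight A = a and amplitude eps < a
N = 4096
x = np.arange(N) / N               # uniform grid on the circle [0, 1)

# psi(x) = eps * cos(2 pi x) / (2 pi)^2 has zero mean; its derivatives are:
psi_x = -eps * np.sin(2 * np.pi * x) / (2 * np.pi)
psi_xx = -eps * np.cos(2 * np.pi * x)

T = x - psi_x / a                  # T(x) = x - A^{-1} psi'(x)
f = 1.0 - psi_xx / a               # Monge-Ampere with g = 1: f = det[I - A^{-1} D^2 psi]

# If T pushes f dx onto the uniform measure, then for every integer k >= 1
# the integrals of cos(2 pi k T) f and sin(2 pi k T) f over the circle vanish.
residual = max(
    max(abs(np.mean(np.cos(2 * np.pi * k * T) * f)),
        abs(np.mean(np.sin(2 * np.pi * k * T) * f)))
    for k in range(1, 6)
)
total_mass = np.mean(f)
print(f.min(), total_mass, residual)
```

Positivity of f here is the one-dimensional form of the condition A − D²ψ > 0, which is also what makes T a diffeomorphism of the circle.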

4. Evolution Under Degenerate Costs and Connection to Knothe’s Rearrangement

Introducing a family of anisotropic costs $c_t(x, y) = \frac{1}{2} \sum_{k=1}^n t^{k-1}(x_k - y_k)^2$, that is, $A_t = \mathrm{diag}(1, t, \ldots, t^{n-1})$, one obtains a smooth evolution of the optimal map as the anisotropy parameter t → 0. The associated PDE for the Kantorovich potentials ψ_t is

$$\mathrm{Div} \left\{ f(x)\,[A_t - D^2 \psi_t]^{-1}[\nabla\dot{\psi}_t - \dot{A}_t A_t^{-1} \nabla\psi_t] \right\} = 0$$

[Equation (4.7) in (Bonnotte, 2012)].

In the limit t → 0, the cost weights the x₁-direction ever more heavily relative to the remaining coordinates (the k-th coordinate carries weight t^{k−1}), and the Brenier map T_t converges in L² to the Knothe–Rosenblatt rearrangement R, which is built by successive one-dimensional monotone rearrangements of the conditional marginals. The decomposition

$$\psi_t(x) = \psi_t^1(x_1) + t\,\psi_t^2(x_1, x_2) + \ldots + t^{n-1}\psi_t^n(x_1,\ldots,x_n)$$

aligns with a smooth evolution in t that yields, at t=0, the one-dimensional Kantorovich potentials associated with the Knothe rearrangement (Bonnotte, 2012).
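The convergence to the Knothe–Rosenblatt rearrangement can be observed in a Gaussian toy case on ℝ², which is an illustrative setting rather than the torus setting of the cited work. Assume μ = N(0, I) and ν = N(0, Σ) with a made-up Σ. A change of variables x ↦ A_t^{1/2}x reduces the cost c_t to the standard quadratic cost, which gives the linear optimal map x ↦ A_t⁻¹(A_tΣA_t)^{1/2}x. The Knothe–Rosenblatt map between these Gaussians is x ↦ Lx, with L the lower-triangular Cholesky factor of Σ.

```python
import numpy as np

def sqrtm_psd(M):
    """Symmetric square root of a symmetric positive semidefinite matrix."""
    w, V = np.linalg.eigh(M)
    return (V * np.sqrt(np.clip(w, 0.0, None))) @ V.T

def optimal_matrix(Sigma, t):
    """Matrix M_t of the optimal map x -> M_t x from N(0, I) to N(0, Sigma) for cost c_t."""
    n = Sigma.shape[0]
    A = np.diag(t ** np.arange(n))             # A_t = diag(1, t, ..., t^(n-1))
    return np.linalg.solve(A, sqrtm_psd(A @ Sigma @ A))

Sigma = np.array([[2.0, 0.6], [0.6, 1.0]])     # assumed target covariance
L = np.linalg.cholesky(Sigma)                  # Knothe-Rosenblatt map x -> L x

ts = [1.0, 1e-1, 1e-2, 1e-3]
maps = [optimal_matrix(Sigma, t) for t in ts]
errors = [np.linalg.norm(M - L) for M in maps]
for t, err in zip(ts, errors):
    print(f"t = {t:g}   ||M_t - L|| = {err:.2e}")
```

At t = 1 the map is the symmetric square root Σ^{1/2}, the usual Brenier map. As t decreases, the upper-right entry is driven to zero and the matrix approaches the triangular factor L, while M_t M_tᵀ = Σ holds for every t.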

5. Smooth Dependence on Cost Structures and Nash–Moser Theory

For families of costs parameterized by positive-definite matrices A_t, the map A ↦ ψ_A is C¹ from $S_n^{++}$ to $C^{n+2,\alpha}(\mathbb{T}^n)$. As t ↓ 0, existence and regularity of solutions to the coupled system for (ψ_t¹, ψ_t²) are guaranteed by the Nash–Moser inverse function theorem, which handles the loss of derivatives and ensures smooth dependence on t ∈ [0, ε) (Bonnotte, 2012).

The linearizations of the nonlinear system are isomorphisms whose inverses satisfy “tame estimates,” which is the hypothesis the Nash–Moser theorem requires. This framework recovers the Brenier structure for t > 0 and shows that the map degenerates to the Knothe rearrangement as t → 0.

6. Extensions: Weak and Barycentric Transport and Mixture Theorems

Generalizations of Brenier’s result appear for weakened quadratic costs, notably the barycentric cost, defined as

$$\overline{T}_2(\nu\mid\mu) := \inf_{\pi\in C(\mu,\nu)} \int_{\mathbb{R}^d} \left|\int y\,p_x(dy) - x \right|^2 \, \mu(dx),$$

with π(dx, dy) = μ(dx) p_x(dy). For such costs, every optimal plan decomposes into (1) a deterministic Brenier transport X ↦ X' = ∇φ(X), pushing μ onto its barycentric projection μ̄, and (2) a martingale coupling from μ̄ to ν (Gozlan et al., 2018).
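For discrete measures, the barycentric cost functional of a given plan is simple to evaluate. The sketch below uses a hypothetical two-point source and three-point target, and a helper `barycentric_cost` that computes the integrand of the definition above for one fixed plan π. It does not compute the infimum over plans. The example compares a martingale coupling with the independent (product) coupling.

```python
import numpy as np

def barycentric_cost(x, y, pi):
    """sum_i mu_i | sum_j y_j p_{x_i}(y_j) - x_i |^2 for a plan pi with first marginal mu."""
    mu = pi.sum(axis=1)               # first marginal of the plan
    bary = (pi @ y) / mu              # conditional mean of y given x_i
    return float(np.sum(mu * (bary - x) ** 2))

# Assumed toy data: mu = (1/2, 1/2) on {-1, 1}, nu = (1/4, 1/2, 1/4) on {-2, 0, 2}.
x = np.array([-1.0, 1.0])
y = np.array([-2.0, 0.0, 2.0])

# Martingale coupling: each x_i is the mean of the mass it sends out.
martingale = np.array([[0.25, 0.25, 0.00],
                       [0.00, 0.25, 0.25]])
# Independent coupling: every x_i sends out a copy of nu, whose mean is 0.
product = np.outer([0.5, 0.5], [0.25, 0.5, 0.25])

cost_martingale = barycentric_cost(x, y, martingale)
cost_product = barycentric_cost(x, y, product)
print(cost_martingale, cost_product)
```

Because the cost is nonnegative and the martingale coupling achieves zero, that coupling is optimal for these two measures. In terms of the decomposition above, the deterministic Brenier step is the identity here and the whole plan is the martingale part.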

A full equivalence is established: $W_2^2(\mu, \nu) = \overline{T}_2(\nu\mid\mu)$ if and only if there is a convex φ ∈ C¹ with 1-Lipschitz gradient pushing μ onto ν, which links to Caffarelli’s contraction theorem. When ν has a density proportional to e^{−V} with respect to a reference Gaussian, with V convex, the Brenier map from that Gaussian to ν is 1-Lipschitz, that is, a contraction (Gozlan et al., 2018).

7. Impact, Applications, and Further Directions

Brenier’s theorem provides the canonical tool for generating optimal transport maps in numerous mathematical, computational, and applied domains. Its structure underlies the analysis and numerical computation of transport maps, with direct connections to partial differential equations (Monge–Ampère), metric geometry, and probability. The smooth continuation from Brenier maps to the Knothe–Rosenblatt rearrangement elucidates the role of the cost function’s anisotropy and its effect on optimal transport structure. These insights inform the analysis of degenerate cases, the regularity of solutions, and the construction of numerical algorithms exploiting the smooth dependence on transport cost parameters (Bonnotte, 2012).

Further implications include the characterization of weak transport costs via barycentric projections and their relation to deterministic and stochastic components of optimal plans, as articulated in the mixture theorem and its connections to contraction principles (Gozlan et al., 2018). This suggests both a deeper structural understanding of transport optimality and new potential computational strategies driven by degenerate-cost limits.

