Infimal Convolution Cost in Convex Analysis
- Infimal convolution cost is a mathematical framework in convex analysis that blends multiple cost functions or regularizers through variational minimization.
- It is applied in imaging, optimal transport, and learning to decompose energies and model multiple noise types or structural priors.
- The method preserves convexity and duality, facilitating efficient decomposition and the recovery of structured solutions in complex inverse problems.
The infimal convolution cost is a construct originating in convex analysis that allows the blending of multiple cost or regularization structures via a variational minimization. This cost structure appears under diverse guises across mathematical optimization, partial differential equations, image processing, statistical mechanics, optimal transport, and learning theory. Formally, the infimal convolution of two (extended-real) functions $f, g \colon X \to \mathbb{R} \cup \{+\infty\}$ on a vector space $X$ is the function defined by

$$(f \,\square\, g)(x) = \inf_{x_1 + x_2 = x} \big\{ f(x_1) + g(x_2) \big\}.$$
This operation constructs composite fidelity terms in variational models, dualizes to sums under convex conjugation, and defines metric and probabilistic costs in unbalanced optimal transport. Its use as a "cost" is particularly crucial when the underlying mathematical problem, such as regularization, transport, or duality, requires the joint modeling or interpolation of multiple, potentially antagonistic, effects.
1. Definition and General Construction
Infimal convolution cost arises by minimizing over all possible splittings of a variable into two (or more) components, penalizing each component by an associated cost, and aggregating through the infimum. For $f, g \colon X \to \mathbb{R} \cup \{+\infty\}$, the prototypical definition is

$$(f \,\square\, g)(x) = \inf_{y \in X} \big\{ f(y) + g(x - y) \big\}.$$
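As a concrete illustration (not taken from the cited papers), a minimal numerical sketch of this definition on a uniform grid, computing $(f \,\square\, g)(x)$ by brute-force minimization over splittings; the grid, the two cost functions, and the helper name `infimal_convolution` are illustrative choices:

```python
import numpy as np

# Uniform grid on which both functions are sampled (illustrative choice).
x = np.linspace(-5.0, 5.0, 1001)
f = 0.5 * x**2            # quadratic cost
g = np.abs(x)             # absolute-value cost

def infimal_convolution(f_vals, g_vals, grid):
    """Brute-force (f □ g)(x_i) = min_j { f(x_j) + g(x_i - x_j) },
    with g(x_i - x_j) evaluated by linear interpolation on the grid."""
    out = np.empty_like(f_vals)
    for i, xi in enumerate(grid):
        g_shift = np.interp(xi - grid, grid, g_vals)  # g(x_i - x_j) for all j
        out[i] = np.min(f_vals + g_shift)
    return out

h = infimal_convolution(f, g, x)
# For f = t^2/2 and g = |t| the result is the Huber function, which
# gives a quick sanity check up to discretization error:
huber = np.where(np.abs(x) <= 1.0, 0.5 * x**2, np.abs(x) - 0.5)
print(np.max(np.abs(h - huber)))
```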
In the context of optimal transport and metric geometry, the corresponding construction for (pseudo-)distances $d_1, d_2$ on a space $X$ is

$$(d_1 \,\square\, d_2)(x, y) = \inf_{z \in X} \big\{ d_1(x, z) + d_2(z, y) \big\},$$

which is the "one-step" or "metric infimal convolution." Higher-step variants (iterated infimal convolutions) can be defined for $n$-step paths as

$$(d_1 \,\square\, \cdots \,\square\, d_n)(x, y) = \inf_{x = z_0,\, z_1, \dots,\, z_n = y} \; \sum_{i=1}^{n} d_i(z_{i-1}, z_i).$$
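On a finite space, the one-step metric infimal convolution is exactly a min-plus matrix product of the two distance matrices, and iterating that product builds the multi-step variant. A small sketch with hypothetical distance matrices:

```python
import numpy as np

def metric_infconv(D1, D2):
    """One-step metric infimal convolution of two distance matrices:
    (D1 □ D2)[x, y] = min_z D1[x, z] + D2[z, y]  (a min-plus matrix product)."""
    # Entry (x, z, y) of the broadcast sum holds D1[x, z] + D2[z, y];
    # minimizing over z gives the one-step cost.
    return np.min(D1[:, :, None] + D2[None, :, :], axis=1)

# Two toy pseudo-distance matrices on a 3-point space (illustrative values).
D1 = np.array([[0., 1., 4.], [1., 0., 2.], [4., 2., 0.]])
D2 = np.array([[0., 3., 1.], [3., 0., 5.], [1., 5., 0.]])

one_step = metric_infconv(D1, D2)        # (d1 □ d2)(x, y)
two_step = metric_infconv(one_step, D1)  # an iterated, 3-step path cost
print(one_step)
```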
The infimal convolution preserves convexity and, under suitable conditions, regularity and coercivity. A fundamental property is its duality: for proper convex $f, g$, one has $(f \,\square\, g)^* = f^* + g^*$, where $^*$ denotes the Fenchel conjugate (Mahmudov, 2019).
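The conjugacy identity can be checked numerically. The sketch below (an illustration, not from the cited work) compares $(f \,\square\, g)^*$ against $f^* + g^*$ for two quadratics, with both the conjugate and the infimal convolution computed by grid search:

```python
import numpy as np

grid = np.linspace(-4.0, 4.0, 801)
a, b = 1.0, 3.0
f = 0.5 * a * grid**2
g = 0.5 * b * grid**2

def conjugate(vals, grid, duals):
    # Fenchel conjugate h*(s) = sup_x { s*x - h(x) }, by grid search.
    return np.array([np.max(s * grid - vals) for s in duals])

def infconv(f_vals, g_vals, grid):
    # (f □ g)(x) = min_y { f(y) + g(x - y) }, by grid search with interpolation.
    out = np.empty_like(f_vals)
    for i, xi in enumerate(grid):
        out[i] = np.min(f_vals + np.interp(xi - grid, grid, g_vals))
    return out

duals = np.linspace(-1.0, 1.0, 101)  # dual points well inside the grid's slope range
lhs = conjugate(infconv(f, g, grid), grid, duals)            # (f □ g)*
rhs = conjugate(f, grid, duals) + conjugate(g, grid, duals)  # f* + g*
print(np.max(np.abs(lhs - rhs)))  # agree up to discretization error
```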
2. Applications in Variational Regularization and Imaging
Infimal convolution costs have become central in advanced regularization schemes for inverse problems, image denoising, and signal processing. In such problems, the energy to minimize typically splits into a data fidelity term (matching the observed data $f$) and one or several regularization functionals enforcing desired smoothness or sparsity. The infimal convolution allows formulating composite regularizers that interpolate between different structural priors.
A major example is the TVL$^p$ family (Burger et al., 2015, Burger et al., 2015):
| Model | Regularizer Formulation |
|---|---|
| TV–L$^2$ infimal convolution | $\min_{w}\; \alpha\,\|Du - w\|_{\mathcal{M}} + \beta\,\|w\|_{L^2}$ |
| TVL$^p$ ($1 < p \le \infty$) | $\min_{w}\; \alpha\,\|Du - w\|_{\mathcal{M}} + \beta\,\|w\|_{L^p}$ |
The infimal convolution splits the (distributional) derivative into a possibly unbounded and a bounded part, regularizes each, and leads to models that interpolate between standard total variation (TV), Huber-type TV, and total generalized variation (TGV). This approach eliminates artifacts such as staircasing and better preserves complicated geometric structures (Burger et al., 2015, Burger et al., 2015, Gao et al., 2017).
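As an illustrative sketch (not code from the cited papers), a discrete 1-D TV–L$^2$ infimal-convolution denoising model can be written directly in cvxpy; the toy signal, the weights $\alpha, \beta$, and the discretization are hypothetical choices:

```python
import numpy as np
import cvxpy as cp

# Noisy piecewise-linear toy signal (hypothetical data).
rng = np.random.default_rng(0)
n = 200
t = np.linspace(0.0, 1.0, n)
clean = np.where(t < 0.5, 2.0 * t, 2.0 - 2.0 * t)
f = clean + 0.05 * rng.standard_normal(n)

# TV-L^2 infimal-convolution regularizer: split the discrete derivative Du
# into a TV-penalized part (Du - w) and an L^2-penalized part (w).
alpha, beta = 0.5, 5.0
u = cp.Variable(n)
w = cp.Variable(n - 1)
fidelity = 0.5 * cp.sum_squares(u - f)
regularizer = alpha * cp.norm(cp.diff(u) - w, 1) + beta * cp.norm(w, 2)
cp.Problem(cp.Minimize(fidelity + regularizer)).solve()

u_hat = u.value  # denoised signal; w.value is the recovered bounded slope part
```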
In image denoising for mixed noise, the data discrepancy itself is constructed as an infimal convolution to capture, e.g., both salt-and-pepper and Gaussian noise:

$$(\Phi_1 \,\square\, \Phi_2)(f - u) = \inf_{v} \big\{ \lambda_1 \Phi_1(v) + \lambda_2 \Phi_2(f - u - v) \big\},$$

where $\Phi_1$ and $\Phi_2$ penalize, respectively, the different noise types (Calatroni et al., 2016).
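For orientation, a standard computation (not specific to the cited paper): with $\Phi_1 = |\cdot|$ and $\Phi_2 = \tfrac12|\cdot|^2$ applied pointwise to the residual $r = f - u$, the infimal convolution has a closed form,

$$\inf_{v \in \mathbb{R}} \Big\{ \lambda_1 |v| + \tfrac{\lambda_2}{2}\,(r - v)^2 \Big\}
= \begin{cases} \tfrac{\lambda_2}{2}\, r^2, & |r| \le \lambda_1/\lambda_2, \\[2pt] \lambda_1 |r| - \tfrac{\lambda_1^2}{2\lambda_2}, & |r| > \lambda_1/\lambda_2, \end{cases}$$

a Huber-type discrepancy: the minimizer $v^* = \operatorname{sign}(r)\max(|r| - \lambda_1/\lambda_2,\, 0)$ is a soft-thresholding of the residual and isolates the impulsive noise component, while the remainder is penalized quadratically.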
3. Metric Structures and Optimal Transport
The infimal convolution cost plays a key role in the modern theory of unbalanced optimal transport. The celebrated result is the metric infimal convolution decomposition of the Hellinger–Kantorovich (HK) distance on nonnegative measures (Ponti et al., 17 Mar 2025). Given the Hellinger (Fisher–Rao) distance $\mathsf{He}$ and the Wasserstein distance $W_2$ on measures, the squared HK distance is given by

$$\mathsf{HK}^2(\mu_0, \mu_1) = \inf_{\mu} \big\{ \mathsf{He}^2(\mu_0, \mu) + W_2^2(\mu, \mu_1) \big\},$$
with suitable extensions to multi-step (iterated) convolutions and dual forms. This structure is exact in the sense that minimizing paths for the HK geodesic flow correspond to sequences of Hellinger and Wasserstein updates (Ponti et al., 17 Mar 2025).
In multi-marginal optimal transport, the infimal-convolution cost

$$c(x_1, \dots, x_N) = \inf_{y} \; \sum_{i=1}^{N} \lambda_i\, |x_i - y|^2$$
serves as the cost function in both static and dynamical formulations. The associated Benamou–Brenier dynamical version consists of coupled continuity equations whose common initial measure is the Wasserstein barycenter. The equivalence between the static infimal convolution MMOT and the dynamical barycenter–flow formulation is now established (Krannich, 14 Dec 2025).
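As a quick numerical illustration (hypothetical values, assuming the quadratic cost above with weights summing to one): the inner minimization is attained at the weighted Euclidean barycenter $y^* = \sum_i \lambda_i x_i$, which a grid search confirms:

```python
import numpy as np

# Weights summing to one and sample points (hypothetical values).
lam = np.array([0.2, 0.3, 0.5])
xs = np.array([-1.0, 0.5, 2.0])

# Closed form: min_y sum_i lam_i (x_i - y)^2 is attained at the weighted mean.
y_star = np.sum(lam * xs)
c_closed = np.sum(lam * (xs - y_star) ** 2)

# Brute-force check over a fine grid of candidate barycenters y.
ys = np.linspace(-5.0, 5.0, 100001)
c_grid = np.min(np.sum(lam[:, None] * (xs[:, None] - ys[None, :]) ** 2, axis=0))

print(c_closed, c_grid)  # agree up to grid resolution
```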
4. Duality, Conjugation, and the Infimal Convolution Inequality
Infimal convolution naturally arises in convex duality, particularly in the Fenchel–Rockafellar theorem and subdifferential calculus. For proper convex $f, g$, the conjugacy identity

$$(f \,\square\, g)^* = f^* + g^*$$

holds independently of further interiority (qualification) constraints (Mahmudov, 2019).
In probability theory, the convex infimal-convolution inequality (ICI) provides concentration and moment inequalities for random vectors: a measure $\mu$ satisfies the inequality with cost $w$ if, for every convex function $f$,

$$\int e^{\,f \,\square\, w}\, d\mu \cdot \int e^{-f}\, d\mu \;\le\; 1,$$

where $w$ is an optimal cost function, typically a dilation of the Legendre transform of the log-Laplace transform of $\mu$ (Strzelecka et al., 2017).
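For intuition, a classical special case (Maurey's property $(\tau)$, stated here as an illustration rather than the cited paper's result): the standard Gaussian satisfies such an inequality with the quadratic cost $w(x) = x^2/4$. A direct quadrature check for a quadratic test function (the grid and the choice of $f$ are hypothetical):

```python
import numpy as np

# Standard Gaussian density on a truncated grid (tails are negligible here).
x = np.linspace(-12.0, 12.0, 4001)
dx = x[1] - x[0]
gauss = np.exp(-x**2 / 2) / np.sqrt(2 * np.pi)

def infconv(f_vals, g_vals, grid):
    # (f □ g)(x) = min_y { f(y) + g(x - y) }, by grid search with interpolation.
    out = np.empty_like(f_vals)
    for i, xi in enumerate(grid):
        out[i] = np.min(f_vals + np.interp(xi - grid, grid, g_vals))
    return out

f = 0.2 * x**2        # a convex test function (hypothetical choice)
w = x**2 / 4.0        # Maurey's cost for the standard Gaussian

lhs = np.sum(np.exp(infconv(f, w, x)) * gauss) * dx   # integral of e^{f □ w} dmu
rhs = np.sum(np.exp(-f) * gauss) * dx                 # integral of e^{-f} dmu
print(lhs * rhs)  # <= 1, as the infimal convolution inequality predicts
```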
In analysis on metric graphs, the infimal convolution operator generates semigroups satisfying discrete analogues of Hopf–Lax evolution and Hamilton–Jacobi equations, and thereby establishes equivalences between hypercontractivity, (modified) log–Sobolev inequalities, and transport–entropy inequalities (Shu, 2015).
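A minimal sketch of the discrete Hopf–Lax operator $Q_t f(x) = \min_y \{ f(y) + d(x,y)^2/(2t) \}$, which is an infimal convolution of $f$ with a quadratic cost built from the graph distance; the graph, the function $f$, and the helper names are hypothetical:

```python
import numpy as np

# Adjacency weights of a small graph (np.inf = no edge); hypothetical example.
INF = np.inf
W = np.array([[0,   1,   INF, 4],
              [1,   0,   2,   INF],
              [INF, 2,   0,   1],
              [4,   INF, 1,   0]], dtype=float)

# Floyd-Warshall: all-pairs shortest-path distance matrix d(x, y).
d = W.copy()
for k in range(len(d)):
    d = np.minimum(d, d[:, [k]] + d[[k], :])

def hopf_lax(f, d, t):
    # Q_t f(x) = min_y { f(y) + d(x, y)^2 / (2 t) }.
    return np.min(f[None, :] + d**2 / (2.0 * t), axis=1)

f = np.array([0.0, 3.0, 1.0, 2.0])   # initial function on the vertices
print(hopf_lax(f, d, t=0.5))          # one step of the Hopf-Lax semigroup
```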
5. Statistical Inference and Learning
Infimal convolution costs have been leveraged in statistical estimation for robust regression losses and penalization. In functional output regression, the infimal convolution of the squared output-space norm $\tfrac{1}{2}\|\cdot\|_Y^2$ and a convex term $\kappa\,\|\cdot\|_p$ generates Huber-type and $\epsilon$-insensitive losses:

$$H^p_\kappa := \tfrac{1}{2}\,\|\cdot\|_Y^2 \,\square\, \kappa\,\|\cdot\|_p,$$
with closed-form expressions involving $p$-norm projections, yielding losses that combine quadratic and linear penalties for outlier tolerance (Lambert et al., 2022). The dual forms become tractable, and the approach provides a spectrum from purely quadratic to robust or sparsity-promoting behavior, depending on the infimal convolution's secondary component.
In tropical and max-product inference for graphical models, the $(\max,+)$ infimal (max-)convolution for sequences,

$$(f \,\boxdot\, g)(z) = \max_{x + y = z} \big\{ f(x) + g(y) \big\},$$

enables fast algorithms for MAP inference in convolution tree structures and specialized hidden Markov models, critically reducing computational cost via $p$-norm approximations and FFT-based convolution (Serang, 2015).
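A brute-force sketch of the $(\max,+)$ convolution of two finite sequences; the fast $p$-norm/FFT approximations in (Serang, 2015) replace exactly this quadratic-time step, and the values below are hypothetical:

```python
import numpy as np

def maxplus_convolve(f, g):
    """(f ⊡ g)(z) = max over x + y = z of f(x) + g(y), for 0-indexed sequences."""
    out = np.full(len(f) + len(g) - 1, -np.inf)
    for x, fx in enumerate(f):
        for y, gy in enumerate(g):
            out[x + y] = max(out[x + y], fx + gy)
    return out

# Log-scores of two independent discrete factors (hypothetical values).
f = np.log(np.array([0.1, 0.6, 0.3]))
g = np.log(np.array([0.5, 0.5]))
print(maxplus_convolve(f, g))  # log of the max-product score for each total z
```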
6. Abstract, Infinite, and High-Dimensional Generalizations
The abstract form of the infimal convolution extends to infinite-dimensional or parametrized families of one-homogeneous convex functionals, leading to "infinite infimal convolution" regularizers of the (schematic) form

$$\mathcal{R}(u) = \inf\Big\{ \int_A \varphi_a(u_a)\, d\mu(a) \;:\; u = \int_A u_a\, d\mu(a) \Big\},$$
with a sparse representer theorem derived for finite-dimensional measurements and generalized conditional gradient (off-the-grid) optimization schemes. These extensions admit regularizers capable of adapting smoothness and anisotropy, with provable well-posedness, coercivity, and convergence guarantees (Bredies et al., 2023).
7. Significant Theoretical and Practical Properties
Key theoretical properties of the infimal convolution cost include:
- Preservation of convexity: the infimal convolution $f \,\square\, g$ of convex lsc functionals remains convex; lower semicontinuity is preserved under additional assumptions such as exactness or coercivity of the convolution.
- Coercivity: if the constituent functionals are coercive (up to finite-dimensional nullspaces), their infimal convolution remains coercive (Gao et al., 2017, Burger et al., 2015).
- Decomposability and structure recovery: The cost enables splitting the solution into interpretable components, e.g., noise types in imaging (Calatroni et al., 2016), or texture/frequency directions in oscillatory TGV (Gao et al., 2017).
- Duality and exactness: infimal convolution plays a critical role in strong duality results and in the construction of explicit dual problems in convex control, regularization, and path-space problems (Mahmudov, 2019, Shu, 2015).
In summary, infimal convolution costs provide a universal, rigorous approach to synthesizing complex energies across convex analysis, transport theory, statistical estimation, and image regularization, with extensive structural, dual, computational, and interpretability benefits.