Convex Relaxation in Optimization
- Convex Relaxation (CR) is a method that replaces a nonconvex feasible set or energy function with a tractable convex outer approximation, providing rigorous lower bounds.
- Techniques like conic representations, lift-and-project, and convex envelopes are employed to construct tight relaxations that balance computational efficiency with solution accuracy.
- Applications in discrete tomography, graphical models, and variational imaging demonstrate CR’s effectiveness in obtaining near-optimal solutions for challenging nonconvex problems.
Convex Relaxation (CR) is a central paradigm for obtaining tractable surrogates to optimization and inference problems originally posed over discrete, combinatorial, or otherwise nonconvex sets. CR involves replacing an intractable feasible set or energy function with a convex outer approximation that supports efficient convex optimization, yielding rigorous lower bounds and, frequently, approximate or even exact solutions. Over the past two decades, CR has seen systematic development in combinatorial optimization, variational inference, inverse problems, statistical estimation, and graphical models. This article organizes the foundational mathematical principles, algorithmic constructions, theoretical guarantees, and key applications emerging from the convex-relaxation literature, with selected technical focus on tight relaxations for discrete tomography and nonconvex variational problems.
1. Mathematical Framework of Convex Relaxation
At its core, convex relaxation addresses an intractable (typically NP-hard) problem of the generic form
$$\min_{x \in \mathcal{X}} f(x),$$
where $\mathcal{X}$ is nonconvex (e.g. discrete labelings, binary vectors, or nonconvex manifolds) and $f$ is convex, linear, or structured. The convex relaxation substitutes $\mathcal{X}$ with a convex set $\mathcal{C} \supseteq \mathcal{X}$ that is computationally tractable (e.g. admits an LP/SDP representation, conic constraints, or efficient separation oracles), giving the relaxed problem
$$\min_{x \in \mathcal{C}} f(x).$$
By construction, the optimal value of the relaxation lower-bounds the original minimum (upper-bounds it for maximization). Hierarchies of such relaxations, of varying tightness, are a signature feature, allowing tradeoffs between runtime and approximation fidelity (Chandrasekaran et al., 2012).
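The lower-bound property can be seen on a toy instance (the problem below is hypothetical, chosen only to illustrate the mechanism): a convex objective minimized over the nonconvex set {0,1}^2, relaxed to the box [0,1]^2.

```python
# Minimal sketch (toy problem, not from the cited papers): minimize a convex
# quadratic over the nonconvex set {0,1}^2 and compare with its convex
# relaxation over the box [0,1]^2.
from itertools import product

def f(x):
    # convex objective; the nonconvexity lives entirely in the feasible set
    return (x[0] - 0.4) ** 2 + (x[1] - 0.6) ** 2

# original problem: brute force over the discrete set {0,1}^2
discrete_opt = min(f(x) for x in product([0, 1], repeat=2))

# relaxed problem: over the box [0,1]^2 the unconstrained minimizer
# (0.4, 0.6) is feasible, so the relaxation is solved in closed form
relaxed_opt = f((0.4, 0.6))

print(discrete_opt, relaxed_opt)  # relaxed_opt <= discrete_opt always
```

Here the relaxed value 0 strictly lower-bounds the discrete optimum 0.32, exhibiting the integrality gap that tighter relaxations aim to close.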
Key representation techniques include:
- Conic representations: Expressing the relaxation as linear optimization over standard cones (nonnegative orthant, semidefinite cone).
- Lift-and-project: Auxiliary variables are introduced so that the original constraints are lifted into a higher-dimensional convex set, such as the local polytope or marginal polytope in graphical models.
- Convex hull or envelopes: For function relaxations, taking the convex envelope (e.g. as the biconjugate or via epigraph convexification) yields the tightest possible convex surrogate under pointwise domination (Azar et al., 2016, Möllenhoff et al., 2015).
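As a concrete instance of lifting combined with envelopes (a standard textbook construction, not specific to the cited papers), the bilinear term z = x*y can be replaced by an auxiliary variable bounded by its McCormick envelopes, which on the unit box coincide with the convex and concave envelopes of x*y:

```python
# Hedged sketch: McCormick envelopes of the bilinear term z = x*y on [0,1]^2.
# The lifted variable z is constrained to lie between the convex envelope
# max(0, x+y-1) and the concave envelope min(x, y).
def mccormick_bounds(x, y):
    # valid for (x, y) in the unit box [0, 1]^2
    lower = max(0.0, x + y - 1.0)   # convex envelope of x*y
    upper = min(x, y)               # concave envelope of x*y
    return lower, upper

# sanity check on a grid: the true product always lies between the envelopes
ok = True
for i in range(21):
    for j in range(21):
        x, y = i / 20.0, j / 20.0
        lo, hi = mccormick_bounds(x, y)
        ok = ok and (lo - 1e-12 <= x * y <= hi + 1e-12)
print(ok)
```

Optimizing a linear function of (x, y, z) over these constraints is an LP, whereas the original bilinear problem is nonconvex; this is the lift-and-project pattern in its smallest form.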
2. Tight Convex Relaxation for Discrete Tomography
The discrete-tomography problem exemplifies the application of tight convex relaxation in structured inverse problems. Here, the goal is to reconstruct a discrete-valued image $x$ from a small set of linear tomographic measurements $Ax = b$ while minimizing a pairwise Markov random field (MRF) energy that encodes local smoothness priors. The novel relaxation developed in (Kuske et al., 2017) constructs a joint graphical model capturing both MRF dependencies and measurement constraints through high-order "ray" factors. Each ray factor enforces a measurement sum constraint, while pairwise potentials encode smoothness.
The relaxation introduces marginal variables over combinatorial versions of these rays via a dyadic decomposition and "counting-factor" marginals, subject to normalization and marginalization constraints. The resulting LP has sub-quadratic complexity, dramatically sharper than classical local-polytope and moment-based approaches. Unlike separable relaxations, the counting-factor LP is exact on each 1D ray, closing the integrality gap for these subproblems and yielding empirically tighter bounds for the global problem: on 400 test cases, it surpasses previous relaxations in at least 350 instances and produces certificates of optimality in a majority of cases (Kuske et al., 2017).
Optimization is realized via dual decomposition: each ray subproblem is solved to optimality by message passing, and Lagrange multipliers enforce agreement across shared variables, yielding a globally consistent solution upon convergence.
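The dual-decomposition loop can be sketched on a minimal toy instance (the energies and step size below are illustrative assumptions, not from the paper): two subproblems share one binary variable, each subproblem is solved exactly given the current multiplier, and the multiplier is updated by subgradient ascent until the copies agree.

```python
# Hedged toy sketch of dual decomposition with a single consensus constraint.
g1 = {0: 0.0, 1: -2.0}   # energy of subproblem 1 (illustrative values)
g2 = {0: 0.0, 1: 1.0}    # energy of subproblem 2; joint optimum is x = 1

lam, step = 0.0, 0.5
for _ in range(50):
    # each subproblem is solved to optimality given the current multiplier
    x1 = min((0, 1), key=lambda x: g1[x] + lam * x)
    x2 = min((0, 1), key=lambda x: g2[x] - lam * x)
    if x1 == x2:              # copies agree: primal-feasible, hence done
        break
    lam += step * (x1 - x2)   # subgradient step on the dual function

print(x1, g1[x1] + g2[x1])   # agreed label and its primal energy
```

On this instance there is no duality gap, so agreement of the copies certifies global optimality, mirroring the convergence behavior described above.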
3. Convex Envelope and Relaxation for Nonconvex Energies
The concept of the convex envelope is a foundational tool for building the tightest possible convex relaxed objectives. For a function $f$, its convex envelope $\hat{f}$ is defined as the supremum of all convex underestimators of $f$. This envelope is crucial for global optimization, since
$$\min_x f(x) = \min_x \hat{f}(x),$$
and $\hat{f}$ can be obtained via either epigraph convexification or by direct biconjugation ($\hat{f} = f^{**}$). However, computing $\hat{f}$ is generally as hard as solving the original problem exactly.
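In one dimension, epigraph convexification can be approximated numerically: sample the function on a grid and take the lower convex hull of the sampled points. The sketch below (a generic construction, with an assumed double-well test function) does exactly that.

```python
# Hedged numerical sketch: approximate the convex envelope of the double well
# f(x) = x**4 - 2*x**2 on [-1.5, 1.5] via the lower convex hull of samples.
def f(x):
    return x ** 4 - 2 * x ** 2

xs = [-1.5 + 3.0 * i / 300 for i in range(301)]
pts = [(x, f(x)) for x in xs]

def cross(o, a, b):
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

# Andrew's monotone chain, lower hull only (points already sorted by x)
hull = []
for p in pts:
    while len(hull) >= 2 and cross(hull[-2], hull[-1], p) <= 0:
        hull.pop()
    hull.append(p)

def envelope(x):
    # piecewise-linear interpolation along the lower hull
    for (x0, y0), (x1, y1) in zip(hull, hull[1:]):
        if x0 <= x <= x1:
            return y0 + (y1 - y0) * (x - x0) / (x1 - x0)
    raise ValueError("x outside sampled range")

# the envelope underestimates f everywhere and matches its global minimum
assert all(envelope(x) <= f(x) + 1e-9 for x in xs)
print(envelope(0.0))  # close to -1.0: the flat bottom between the two wells
```

The hull replaces the nonconvex barrier between the two wells by a flat segment at the global minimum value, which is exactly the behavior the identity above requires.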
An algorithmic alternative, Convex Relaxation Regression (CoRR), fits a parametric convex family to random samples of $f$, adjusting parameters so that the fitted surrogate underestimates $f$ and touches it at optimal points. Given sufficient samples and suitable convex basis functions, CoRR produces solutions that are approximately optimal for the original problem with high probability, linking statistical learning of convex envelopes directly to provable optimization guarantees (Azar et al., 2016).
In variational imaging, the "sublabel-accurate" convex relaxation constructs piecewise convex envelopes on intervals or simplices covering the label domain, ensuring tightness not just at sample points but across the entire range (sub-label accuracy). This local, intervalwise convexification achieves orders-of-magnitude finer solutions with much coarser label grids compared to piecewise linear relaxations, as demonstrated in vision problems such as stereo and denoising (Möllenhoff et al., 2015, Laude et al., 2016).
4. Higher-Order, Lifted, and Cone-Based Relaxations
Modern CR methods frequently adopt lifted formulations involving higher-order marginals and convex polytopes, especially in graphical models and polynomial optimization:
- In discrete tomography, higher-order rays aggregate pixel variables and define LPs over counting-factor marginals, exceeding the local polytope in tightness (Kuske et al., 2017).
- In global optimization of pairwise interaction energies, the bilinear energy is "lifted" by introducing the autocorrelation as a variable, subject to convex cone constraints (positivity, Fourier properties). The resulting LP over the cone of admissible autocorrelations provides global lower bounds and—when recovery is exact—global minimizers for the original nonconvex problem (Bandegi et al., 2015).
- In quadratic and polynomial optimization, convex hull relaxations (CHR), semidefinite programming (SDP), and copositive programming represent a broad spectrum of relaxations. Equivalences between CHR and SDP are established for systems of quadratic equations (Kalantari, 2019), while the copositive cone is shown to yield the convex envelope of quadratic objectives over polytopes, provided no direction of negative curvature exists in the recession cone. In contrast, doubly nonnegative (DNN) relaxations (PSD plus entrywise nonnegativity) may be strictly weaker or fail entirely beyond low dimensions (Yildirim, 2020).
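The simplest member of this family of lifted quadratic relaxations is the spectral bound (a standard construction, weaker than the SDP and copositive relaxations discussed above, and shown here only as a sketch): lifting $X = x x^T$ with $\mathrm{tr}(X) = n$ and relaxing $X$ to any PSD matrix with that trace yields the bound $n \lambda_{\min}(Q)$ on $\min_{x \in \{-1,1\}^n} x^T Q x$.

```python
# Hedged illustration of the spectral (eigenvalue) relaxation on a 2x2 toy.
from itertools import product
from math import sqrt

Q = [[0.0, 1.0], [1.0, 0.0]]   # toy symmetric matrix

def quad(x):
    return sum(Q[i][j] * x[i] * x[j] for i in range(2) for j in range(2))

# exact combinatorial optimum by brute force over {-1, 1}^2
true_min = min(quad(x) for x in product([-1, 1], repeat=2))

# eigenvalues of a symmetric 2x2 matrix in closed form
a, b, d = Q[0][0], Q[0][1], Q[1][1]
lam_min = (a + d) / 2 - sqrt(((a - d) / 2) ** 2 + b ** 2)
spectral_bound = 2 * lam_min   # n * lambda_min with n = 2

print(spectral_bound, true_min)  # the bound is tight on this instance
```

On this instance the spectral bound coincides with the true optimum; in general it is dominated by the SDP relaxation, which adds entrywise constraints on the lifted matrix.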
5. Statistical and Computational Trade-Offs
Convex relaxation inherently mediates a trade-off between statistical risk and computational complexity in high-dimensional inference. In denoising and sparse estimation, one may progress through a hierarchy of relaxations (e.g. the nuclear norm, elliptope, cut polytope in matrix estimation), with weaker relaxations requiring slightly more data to achieve the same risk but affording dramatically cheaper runtime and larger-scale applicability (Chandrasekaran et al., 2012).
A precise theory relates the mean squared error to the Gaussian complexity of the tangent cone at the true parameter; this reveals which relaxations retain optimal rates and when approximation quality deteriorates. As a result, CR delivers both performance guarantees and a computational mechanism for scaling inference with increasing dataset size.
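A concrete face of this trade-off (a generic toy instance, not drawn from the cited analysis) is sparse denoising: the $\ell_1$ norm is the standard convex relaxation of $\ell_0$ sparsity, and the resulting convex program has a closed-form solution, soft-thresholding.

```python
# Hedged denoising sketch: l1-relaxed sparse estimation via soft-thresholding.
import random

random.seed(0)
n, sigma, tau = 200, 0.5, 1.0   # tau is the threshold / regularization level
x_true = [3.0 if i < 10 else 0.0 for i in range(n)]   # sparse signal
y = [xi + random.gauss(0.0, sigma) for xi in x_true]  # noisy observation

def soft(v, t):
    # prox of t*|.|: the minimizer of 0.5*(u - v)**2 + t*|u|
    return max(abs(v) - t, 0.0) * (1 if v > 0 else -1)

x_hat = [soft(v, tau) for v in y]

mse_raw = sum((a - b) ** 2 for a, b in zip(y, x_true)) / n
mse_l1 = sum((a - b) ** 2 for a, b in zip(x_hat, x_true)) / n
print(mse_l1 < mse_raw)   # the relaxation exploits sparsity of x_true
```

The convex estimator trades a small bias on the large entries for a large variance reduction on the many zero entries, the same risk/structure interplay that the tangent-cone theory quantifies.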
6. Algorithmic Realizations and Applications
CR has been realized algorithmically via standard convex solvers, specialized message passing, and modern primal-dual first-order methods:
- In dense CRFs, convex QP relaxations are solved efficiently using Frank-Wolfe and fast filtering; difference-of-convex and LP relaxations provide global energy guarantees that surpass mean-field inference, with practical runtimes comparable to or moderately above the baseline (Desmaison et al., 2016).
- In feature selection, multi-stage CR alternately reweights the penalty in a sequence of convex programs, emulating the effect of the nonconvex capped-$\ell_1$ regularizer. This approach achieves unbiased support recovery where standard Lasso fails, with theoretical guarantees on sample complexity and algorithmic runtime (Zhang, 2011).
- Community detection in stochastic block models employs SDP relaxations of the partition matrix. Dual and primal analyses yield sharp recovery thresholds, adaptivity to heterogeneity, and robustness to outliers (Li et al., 2018).
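The Frank-Wolfe (conditional gradient) solver mentioned for relaxed QPs can be sketched on a minimal relaxed subproblem (the objective and dimensions below are illustrative assumptions, not from the cited paper): a convex quadratic minimized over the probability simplex, using only a linear minimization oracle over the vertices.

```python
# Hedged sketch of Frank-Wolfe on a convex QP over the probability simplex.
q = [0.6, 0.3, 0.1]                        # target distribution (the minimizer)

def grad(p):
    # gradient of f(p) = sum_i (p_i - q_i)**2
    return [2.0 * (pi - qi) for pi, qi in zip(p, q)]

p = [1.0, 0.0, 0.0]                        # start at a simplex vertex
for k in range(2000):
    g = grad(p)
    s = min(range(3), key=lambda i: g[i])  # linear minimization oracle:
                                           # best vertex of the simplex
    gamma = 2.0 / (k + 2.0)                # classical open-loop step size
    p = [(1 - gamma) * pi + gamma * (1.0 if i == s else 0.0)
         for i, pi in enumerate(p)]

f_val = sum((pi - qi) ** 2 for pi, qi in zip(p, q))
print(f_val)   # near 0: iterates converge at the classical O(1/k) rate
```

Each iteration touches only one vertex, which is what makes Frank-Wolfe attractive for dense relaxed labeling problems where projection onto the feasible set would be expensive.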
Empirical studies consistently demonstrate that augmenting relaxation tightness with higher-order, global, or cone-based constraints yields both stronger certificates and more accurate, often nearly optimal, solutions than standard local relaxations.
7. Practical Insights, Limitations, and Future Directions
CR methods are marked by their modularity—supporting diverse constraints (moment, marginal, sum, spectral), objective formulations (linear, quadratic, entropy), and algorithmic strategies (primal-dual, dual decomposition, first-order). High-order relaxations, while offering significant gains, introduce extra per-factor complexity, which can typically be controlled by bespoke data structures or compact block representations (e.g. dyadic trees in counting-factor LPs (Kuske et al., 2017)).
Limitations of CR remain in scaling to very high dimensions, especially where the number or order of higher-order marginals grows super-polynomially, or where the convex envelope is intractable to compute. Additionally, while relaxations typically provide lower bounds and approximate solutions, exactness or certificate of optimality is sometimes elusive (unless integrality gaps vanish by structure or by duality, as in counting-factor or copositive relaxations).
Ongoing research directions include adaptive relaxation strategies (interpolating between levels of the hierarchy in response to problem instance), efficient high-dimensional SDP solvers, analytic characterization of convex hulls in nonlinear settings, and extension of these frameworks to handle more general classes of nonlocal and stochastic optimization problems (Bandegi et al., 2015, Shao et al., 2017, Chen et al., 2023).
References:
- Kuske et al., 2017
- Bandegi et al., 2015
- Möllenhoff et al., 2015
- Azar et al., 2016
- Zhang, 2011
- Kalantari, 2019
- Chandrasekaran et al., 2012
- Li et al., 2018
- Desmaison et al., 2016
- Yildirim, 2020
- Chen et al., 2023
- Dymarsky et al., 2018
- Laude et al., 2016
- Joulin et al., 2012
- Shao et al., 2017
Each citation corresponds to an arXiv preprint containing the detailed technical development of the algorithms, proofs, and empirical validations referenced herein.