Piecewise Convexification Methods
- Piecewise Convexification is a methodology that approximates nonconvex problems by partitioning domains into convex segments to ensure computational tractability.
- The approach leverages techniques like strong smoothing, constrained spline fits, and dual formulations to achieve accurate estimation and efficient global optimization.
- Applications span nonparametric estimation, signal processing, global optimization, MINLP/MILP modeling, and statistical learning with significant computational improvements.
Piecewise convexification refers to a suite of methodologies that approximate, estimate, or relax nonconvex problems by replacing them with piecewise convex representations or constraints. The broad rationale is to leverage convexity in subregions—often identified via structural analysis or domain partitioning—so as to facilitate tractable optimization, statistical estimation, or function approximation with theoretically guaranteed properties. Piecewise convexification arises in nonparametric regression, global and multi-objective nonlinear programming, signal processing, combinatorial optimization, and convex analysis of piecewise quadratic functions. Across these domains, algorithms exploit convexity within segments or sub-domains to yield estimators, relaxations, and representations that balance flexibility, computational tractability, and fidelity to the original problem structure.
1. Piecewise Convex Fitting in Nonparametric Estimation
Piecewise Convex Fitting (PCF) addresses the challenge of estimating a smooth function while preserving the number and locations of its convexity change (inflection) points. Given noisy data $y_i = f(t_i) + \epsilon_i$, with $i = 1, \dots, N$, and an unknown $f$ possessing $K$ points of $\ell$-convexity change (sign changes of the $\ell$-th derivative $f^{(\ell)}$), PCF seeks an adaptive nonparametric estimator that enforces the correct geometric configuration of the estimated function (Riedel, 2018).
The method is fundamentally two-stage:
- Pilot detection by strong smoothing: Apply a heavily smoothed spline (or kernel) estimator with a deliberately oversized penalty parameter $\lambda$, which suppresses spurious inflection points with high probability. Convexity change points are inferred as sign changes of the $\ell$-th derivative of this pilot estimate.
- Constrained smoothing-spline fit: In neighborhoods localized about each empirical change point, retune the smoothing level (selecting $\lambda$ by MSE minimization, e.g., via Generalized Cross-Validation) and impose convexity constraints, that is, sign constraints on the $\ell$-th derivative of the fit, within confidence intervals. The resulting convex restriction can be encoded as a set of linear inequality constraints on the spline coefficients, giving rise to a structured quadratic program.
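The pilot stage can be illustrated with a minimal sketch. It assumes second-order convexity, a uniform design, and a Gaussian kernel smoother standing in for the oversmoothed spline. The function names, the bandwidth `h`, and the boundary `margin` are illustrative choices, not taken from Riedel (2018).

```python
import math
import random

def kernel_smooth(t, y, h):
    """Gaussian Nadaraya-Watson smoother evaluated at the design points t."""
    fit = []
    for ti in t:
        w = [math.exp(-0.5 * ((ti - tj) / h) ** 2) for tj in t]
        fit.append(sum(wj * yj for wj, yj in zip(w, y)) / sum(w))
    return fit

def pilot_change_points(t, y, h, margin=4.0):
    """Locate sign changes of the second difference of an oversmoothed fit.

    Assumes a uniform grid t. Points closer than margin*h to either end are
    ignored, because the kernel is truncated there and curvature is unreliable.
    """
    fit = kernel_smooth(t, y, h)
    lo, hi = t[0] + margin * h, t[-1] - margin * h
    curv = [(t[i], fit[i - 1] - 2.0 * fit[i] + fit[i + 1])
            for i in range(1, len(t) - 1) if lo <= t[i] <= hi]
    changes = []
    for (ta, ca), (tb, cb) in zip(curv, curv[1:]):
        if ca * cb < 0.0:  # curvature changes sign between ta and tb
            changes.append(0.5 * (ta + tb))
    return changes

# Demo: f(t) = t^3 has a single convexity change at t = 0.
random.seed(0)
t = [-1.0 + 2.0 * i / 199 for i in range(200)]
y = [ti ** 3 + random.gauss(0.0, 0.02) for ti in t]
print(pilot_change_points(t, y, h=0.1))
```

Whether the noisy demo reports exactly one change point depends on the noise level and on `h`; a larger bandwidth removes more spurious sign changes at the cost of bias, which is the trade-off the second stage is meant to repair.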
This two-stage process achieves the minimax mean-squared error rate for estimation under Sobolev smoothness, suppresses artificial inflection points with probability tending to 1, yields consistent localization of the change points, and is asymptotically efficient among all procedures recovering the correct number of change points. Downstream, PCF finds application in signal denoising, instantaneous frequency tracking, spectral estimation, and shape-constrained surface recovery.
2. Piecewise-Convex Spline Estimation: Variational and Duality Perspectives
Piecewise-convex spline estimation formalizes the imposition of convexity constraints at prescribed or estimated change points within penalized smoothing frameworks (Riedel, 2018). The variational problem seeks, for the given observations, the minimizer of a penalized data-fitting functional over the class of functions whose $\ell$-th derivative has a prescribed, alternating sign between consecutive change points.
The solution admits a finite-dimensional representation: kernel components anchored at the data points, polynomial nullspaces, and additional "kink" basis elements at the change points with sign-constrained coefficients. This collapses the infinite-dimensional optimization to a convex quadratic program with linear inequality constraints.
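The role of sign-constrained kink coefficients can be seen in a deliberately simplified analogue: a least-squares fit over a linear term plus hinge functions with nonnegative coefficients, which is convex by construction. This is a sketch under stated assumptions, not the spline representation of the source. The hinge basis, the placement of knots at the interior data points, and the coordinate-descent solver are all illustrative choices.

```python
def fit_convex_hinge(t, y, sweeps=200):
    """Least-squares fit of a + b*t + sum_k c_k * max(t - t_k, 0) with c_k >= 0.

    Nonnegative kink coefficients make the fit convex by construction.
    Solved by cyclic coordinate descent, projecting each c_k onto [0, inf).
    """
    n = len(t)
    knots = t[1:-1]                                  # interior points as knots
    cols = [[1.0] * n, list(t)]
    cols += [[max(ti - k, 0.0) for ti in t] for k in knots]
    free = [True, True] + [False] * len(knots)       # a and b are unconstrained
    coef = [0.0] * len(cols)
    resid = list(y)                                  # residual y - X @ coef
    norms = [sum(v * v for v in col) for col in cols]
    for _ in range(sweeps):
        for j, col in enumerate(cols):
            if norms[j] == 0.0:
                continue
            step = sum(v * r for v, r in zip(col, resid)) / norms[j]
            new = coef[j] + step                     # exact 1-D minimizer
            if not free[j]:
                new = max(new, 0.0)                  # enforce the sign constraint
            delta = new - coef[j]
            if delta != 0.0:
                resid = [r - delta * v for r, v in zip(resid, col)]
                coef[j] = new
    fitted = [sum(c * col[i] for c, col in zip(coef, cols)) for i in range(n)]
    return coef, fitted

t = [i / 10.0 for i in range(-10, 11)]
y = [abs(v) + 0.02 * ((7 * i) % 5 - 2) for i, v in enumerate(t)]
coef, fitted = fit_convex_hinge(t, y)
print(min(fitted[i - 1] - 2 * fitted[i] + fitted[i + 1] for i in range(1, len(t) - 1)))
```

The printed minimum second difference is nonnegative up to rounding, whatever the data. A production solver would replace the coordinate descent with a proper QP or nonnegative least-squares routine.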
A robust loss can also be accommodated within this formulation.
The dual formulation leverages Fenchel duality to yield a convex quadratic program in the data-fitting variables with linear constraints enforcing convexity. This dual QP is especially advantageous for large problems and can be solved efficiently with standard methods involving the RKHS smoother matrix and constraint matrix, with model selection guided by cross-validation or information criteria.
3. Piecewise Convexification in Global and Multi-Objective Nonconvex Optimization
Piecewise convexification is central in global optimization algorithms that seek tight relaxations of nonconvex programs, especially those with box constraints (Zhu et al., 2022, Zhu et al., 2022). The αBB (alpha-Branch-and-Bound) methodology constructs, for a given box $X = [x^L, x^U]$, a convex underestimator of the objective $f$, classically of the form $f(x) + \sum_i \alpha_i (x_i^L - x_i)(x_i^U - x_i)$, with $\alpha_i \ge 0$ selected to ensure convexity.
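A one-dimensional sketch of the classical αBB construction follows. The test function $x^4 - 3x^2$ and the helper names are illustrative assumptions; the curvature bound is exact here only because the second derivative is available in closed form, whereas general implementations rely on interval bounds of the Hessian.

```python
def f(x):
    """Illustrative nonconvex objective (hypothetical example)."""
    return x ** 4 - 3.0 * x ** 2

def min_second_derivative(a, b):
    """Exact minimum of f''(x) = 12 x^2 - 6 over [a, b]."""
    x2_min = 0.0 if a <= 0.0 <= b else min(a * a, b * b)
    return 12.0 * x2_min - 6.0

def alpha_for_box(a, b):
    """Smallest alpha >= 0 such that f'' + 2*alpha >= 0 on [a, b]."""
    return max(0.0, -0.5 * min_second_derivative(a, b))

def underestimator(x, a, b):
    """alphaBB relaxation f(x) + alpha*(a - x)*(b - x); the added term is <= 0 on [a, b]."""
    return f(x) + alpha_for_box(a, b) * (a - x) * (b - x)

print(alpha_for_box(-1.0, 1.0))        # 3.0
print(underestimator(0.0, -1.0, 1.0))  # -3.0, while f(0.0) = 0.0
print(alpha_for_box(1.0, 2.0))         # 0.0: f is already convex on [1, 2]
```

On $[-1, 1]$ the relaxation simplifies to $x^4 - 3$, which is convex and touches $f$ at both endpoints of the box.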
The domain is recursively subdivided:
- Sub-boxes where $f$ is already convex (in the αBB sense) are "frozen"; only nonconvex sub-boxes are divided further.
- The union of minimizers over all sub-boxes, filtered against the best objective value obtained, provides an $\varepsilon$-approximation to the global minimizer set.
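The freeze-and-subdivide loop can be sketched in one dimension as follows. This is a simplified illustration, not the algorithm of Zhu et al.: the objective, the width tolerance `width_tol`, the filter tolerance `eps`, and the golden-section solver are all assumptions made for the example.

```python
import math

def f(x):
    """Illustrative objective with two global minimizers at +/- sqrt(1.5)."""
    return x ** 4 - 3.0 * x ** 2

def alpha_for_box(a, b):
    """alphaBB parameter from the exact minimum of f'' = 12 x^2 - 6 on [a, b]."""
    x2_min = 0.0 if a <= 0.0 <= b else min(a * a, b * b)
    return max(0.0, -0.5 * (12.0 * x2_min - 6.0))

def golden_min(g, a, b, tol=1e-10):
    """Golden-section search for the minimizer of a convex g on [a, b]."""
    r = (math.sqrt(5.0) - 1.0) / 2.0
    c, d = b - r * (b - a), a + r * (b - a)
    while b - a > tol:
        if g(c) <= g(d):
            b, d = d, c
            c = b - r * (b - a)
        else:
            a, c = c, d
            d = a + r * (b - a)
    return 0.5 * (a + b)

def piecewise_convexify(lo, hi, width_tol=1e-2, eps=1e-6):
    """Return approximate global minimizers of f on [lo, hi]."""
    stack, candidates = [(lo, hi)], []
    while stack:
        a, b = stack.pop()
        alpha = alpha_for_box(a, b)
        if alpha > 0.0 and b - a > width_tol:
            mid = 0.5 * (a + b)
            stack += [(a, mid), (mid, b)]   # nonconvex box: subdivide further
            continue
        # Frozen box (alpha == 0) or small box: minimize the convex relaxation.
        relax = lambda x, a=a, b=b, alpha=alpha: f(x) + alpha * (a - x) * (b - x)
        x_star = golden_min(relax, a, b)
        candidates.append((f(x_star), x_star))
    best = min(v for v, _ in candidates)
    return sorted(x for v, x in candidates if v <= best + eps)

print(piecewise_convexify(-2.0, 2.0))
```

In this run the boxes $[-2, -1]$ and $[1, 2]$ are frozen after two splits, so all further subdivision is confined to the nonconvex region around the origin.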
This framework extends to multi-objective problems by constructing convex relaxations for each objective function on every sub-box and aggregating weakly efficient and efficient solution sets. The algorithmic structure ensures finite $\varepsilon$-covering of the Pareto or weakly Pareto front, and the inclusion theorems guarantee that the aggregate set approximates the globally nondominated solutions within a prescribed tolerance.
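The aggregation step relies on filtering candidate objective vectors for nondominance. A minimal sketch of that filter, assuming minimization in every objective and using hypothetical candidate values, is:

```python
def dominates(u, v):
    """u dominates v (minimization): no worse in every objective, better in one."""
    return all(a <= b for a, b in zip(u, v)) and any(a < b for a, b in zip(u, v))

def nondominated(points):
    """Keep the objective vectors that no other candidate dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]

# Hypothetical objective vectors collected from several sub-boxes.
candidates = [(1.0, 4.0), (2.0, 2.0), (3.0, 3.0), (4.0, 1.0)]
print(nondominated(candidates))  # [(1.0, 4.0), (2.0, 2.0), (4.0, 1.0)]
```

The quadratic-time scan is adequate for a sketch; the tolerance handling that produces an approximate front is not shown.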
Empirically, classifying and freezing convex sub-boxes yields significant computational savings: 50–90% reduction in subdivisions and 5–20× CPU speed-up compared to pure αBB enumeration, while recovering all minima or nondominated points in tested benchmark problems (Zhu et al., 2022, Zhu et al., 2022).
4. Piecewise Convexification Methods in Mixed-Integer (Non)Linear Programming
For mixed-integer nonlinear programming (MINLP) and mixed-integer linear programming (MILP), piecewise convexification is a relaxation and modeling tool to approximate nonlinear, nonconvex functions with tight and computationally efficient convex or convexified surrogates (Trindade et al., 2022, Birkelbach et al., 2023).
In MINLP, the sequential convex MINLP (SC-MINLP) technique exploits the partition of a separable nonconvex function into convex and concave intervals. The concave parts are replaced by their linear convex envelopes, while convex pieces are modeled exactly through specialized formulations. The theory distinguishes classical incremental models (IM), multiple-choice models (MCM), and convex-combination models (CCM). The perspective reformulation—embedding perspective cuts of the convex segments into the MCM/CCM—achieves the tightest possible continuous relaxations, whereas IM is strictly weaker unless the convex-concave pattern is trivial (Trindade et al., 2022).
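The convex/concave partition idea can be sketched for a univariate function. The example below assumes the breakpoints and the curvature pattern are known in advance, and it shows only the continuous relaxation on each interval; the integer modeling (IM, MCM, CCM) and the perspective cuts are not represented. The helper name `convex_relaxation` is hypothetical.

```python
import math

def convex_relaxation(f, breakpoints, is_convex):
    """Build a piecewise convex underestimator of a univariate f.

    breakpoints: sorted endpoints x_0 < ... < x_m of the intervals.
    is_convex[k]: True if f is convex on [x_k, x_{k+1}], False if concave.
    Convex pieces keep f; concave pieces are replaced by their secant line,
    which is the convex envelope of a concave function on an interval.
    """
    def relaxed(x):
        for k in range(len(breakpoints) - 1):
            a, b = breakpoints[k], breakpoints[k + 1]
            if a <= x <= b:
                if is_convex[k]:
                    return f(x)
                lam = (x - a) / (b - a)
                return (1.0 - lam) * f(a) + lam * f(b)
        raise ValueError("x outside the partitioned domain")
    return relaxed

# sin is concave on [0, pi] and convex on [pi, 2*pi].
relaxed_sin = convex_relaxation(math.sin, [0.0, math.pi, 2.0 * math.pi], [False, True])
print(relaxed_sin(0.5 * math.pi), relaxed_sin(1.5 * math.pi))
```

The result is convex on each interval but not necessarily across breakpoints, which is why the full method needs binary variables to select the active piece.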
For MILP, piecewise convex approximation (PwCA) divides a multivariate domain into two convex regions separated by a hyperplane, with each region approximated by a convex envelope of a small set of hyperplanes. The key advantage is that only one auxiliary binary variable is required to encode this structure, scaling far better than simplex-based piecewise-linearizations in high-frequency function embedding settings (e.g., unit commitment). Empirical results show substantial reductions in constraint counts and solving times, outperforming standard simplex or convex hull log-formulations (Birkelbach et al., 2023).
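A one-variable sketch of the two-region structure is given below. It only evaluates the approximation; the MILP encoding with its single binary variable is not shown, and the target function, the tangent points, and the helper names are illustrative assumptions rather than the construction of Birkelbach et al.

```python
def pwca_value(x, split, left_planes, right_planes):
    """Evaluate a two-region piecewise convex approximation in one variable.

    The binary choice delta = (x >= split) selects the region; each region is
    the maximum of a few affine pieces (slope, intercept), hence convex there.
    """
    delta = 1 if x >= split else 0
    planes = right_planes if delta else left_planes
    return max(m * x + c for m, c in planes)

def tangent(g, dg, x0):
    """Affine piece (slope, intercept) tangent to g at x0."""
    return dg(x0), g(x0) - dg(x0) * x0

# Target: f(x) = (|x| - 1)^2, convex on each side of x = 0 but not overall.
left = [tangent(lambda x: (x + 1.0) ** 2, lambda x: 2.0 * (x + 1.0), p)
        for p in (-2.0, -1.0, 0.0)]
right = [tangent(lambda x: (x - 1.0) ** 2, lambda x: 2.0 * (x - 1.0), p)
         for p in (0.0, 1.0, 2.0)]
print(pwca_value(1.0, 0.0, left, right))  # 0.0
print(pwca_value(0.0, 0.0, left, right))  # 1.0
```

Adding affine pieces to either region tightens the approximation without adding binary variables, which is the scaling advantage described above.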
5. Piecewise Convexification for Explicit Convex Envelope Computation
In convex analysis, computing the closed convex envelope (biconjugate) of a nonconvex piecewise quadratic function is a foundational task (Kumar et al., 2021). For a piecewise quadratic function defined on a polyhedral partition of its domain, the convex envelope of each piece takes a rational form.
The convex conjugate of each piece is analytically characterized on a parabolic subdivision of the dual space, taking the form of a linear, quadratic, or fractional expression. Each piece's conjugate can be computed in time that scales with its number of edges, and an explicit formula is available for each parabolic cell. The conjugate of the whole function is then the pointwise supremum of the per-piece conjugates, and conjugating once more yields the global biconjugate (convex envelope). This development enables explicit piecewise convexification and contributes to bridging nonconvex and convex domains for applications in global optimization and analysis.
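The conjugate-then-conjugate route to the convex envelope can be checked numerically with a brute-force discrete Legendre-Fenchel transform. This sketch is a grid-based stand-in, not the explicit symbolic algorithm of Kumar et al.; the double-well test function and both grids are assumptions chosen for illustration.

```python
def conjugate(points, values, slopes):
    """Discrete Legendre-Fenchel transform: g*(s) = max_x (s*x - g(x))."""
    return [max(s * x - v for x, v in zip(points, values)) for s in slopes]

def convex_envelope(xs, fx, slopes):
    """Biconjugate on a grid: conjugate f, then conjugate the result back."""
    f_star = conjugate(xs, fx, slopes)
    return conjugate(slopes, f_star, xs)

xs = [-2.0 + 4.0 * i / 200 for i in range(201)]
fx = [(x * x - 1.0) ** 2 for x in xs]        # double well, nonconvex on (-1, 1)
slopes = [0.5 * k for k in range(-60, 61)]   # covers the range of f' on [-2, 2]
env = convex_envelope(xs, fx, slopes)
print(fx[100], abs(env[100]))                # values at x = 0
```

At $x = 0$ the function equals 1 while its envelope is 0, reflecting the flat segment that bridges the two wells at $x = \pm 1$. The quadratic cost of this brute-force transform is what explicit per-piece formulas avoid.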
6. Piecewise-Convexification in Statistical Learning
Piecewise-convexification underpins recent developments in regression and classification with difference-of-convex (DC) functions (Siahkamari et al., 2020). Here, a real-valued function is modeled as $f = \phi_1 - \phi_2$, with each $\phi_j$ a max-affine convex function. The regression is regularized by a seminorm controlling a norm of the subgradients of $\phi_1$ and $\phi_2$, enforcing Lipschitz control. The empirical risk minimization problem can be cast as a single quadratic program in variables representing the affine parameters and witness variables at each data point.
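The model class itself is easy to illustrate. The sketch below only evaluates a difference of two max-affine functions in one variable and a simple slope-based bound; it does not implement the quadratic program used for fitting. The example function and the helper names are hypothetical.

```python
def max_affine(pieces):
    """Convex function x -> max_k (a_k * x + b_k) from (slope, intercept) pairs."""
    return lambda x: max(a * x + b for a, b in pieces)

def dc_function(pieces_1, pieces_2):
    """Difference of two max-affine functions: f = phi_1 - phi_2."""
    phi_1, phi_2 = max_affine(pieces_1), max_affine(pieces_2)
    return lambda x: phi_1(x) - phi_2(x)

def slope_bound(pieces_1, pieces_2):
    """Sum of the largest |slope| in each part; bounds the Lipschitz constant of f."""
    return max(abs(a) for a, _ in pieces_1) + max(abs(a) for a, _ in pieces_2)

# f(x) = |x| - |x - 1|: a nonconvex clipped ramp.
p1 = [(1.0, 0.0), (-1.0, 0.0)]
p2 = [(1.0, -1.0), (-1.0, 1.0)]
f = dc_function(p1, p2)
print([f(x) for x in (-1.0, 0.5, 2.0)], slope_bound(p1, p2))  # [-1.0, 0.0, 1.0] 2.0
```

Penalizing a quantity like `slope_bound` during fitting is one way to see how subgradient control translates into Lipschitz control of the fitted function.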
This approach achieves nearly minimax statistical risk, with theoretically validated deviation and generalization bounds via empirical process theory and covering numbers for Lipschitz convex functions. Empirical evaluations on synthetic and real datasets show competitive or superior performance relative to established nonparametric and machine learning methods, alongside efficient computability in moderate dimensions and sample sizes.
7. Summary Table: Representative Piecewise Convexification Approaches
| Domain | Core Principle | Key Reference |
|---|---|---|
| Nonparametric estimation | Two-stage geometric constraint preservation | (Riedel, 2018) |
| Spline estimation | Variational, RKHS dual, shape constraints | (Riedel, 2018) |
| Global/bounded optimization | Box subdivision with convex relaxations | (Zhu et al., 2022) |
| MINLP/MILP modeling | Perspective cuts, binary-encoded convex pieces | (Trindade et al., 2022, Birkelbach et al., 2023) |
| Convex envelope computation | Piecewise rational conjugation, biconjugacy | (Kumar et al., 2021) |
| Regression/classification | Max-affine DC decomposition, convex QP | (Siahkamari et al., 2020) |
These diverse manifestations share the unifying philosophy of decomposing nonconvex structures into piecewise convex surrogates, enabling the application of convex optimization, regularization theory, or combinatorial relaxation at scale.