
Kelley Cutting Plane-Like Method

Updated 18 November 2025
  • Kelley Cutting Plane-Like Method is an iterative optimization algorithm that generates linear cuts to approximate the feasible region and improve solutions.
  • It employs subgradient and gauge reformulation techniques, integrating analytic center updates and projection methods to enhance both convex and mixed-integer models.
  • Modern extensions use trust-region constraints and projected-gradient heuristics to reduce computational time and achieve precise convergence in high-dimensional problems.

The Kelley Cutting Plane-like Method refers to a class of iterative optimization algorithms that generate and aggregate linear (or affine) cuts to approximate and ultimately solve convex and certain nonconvex optimization problems. These methods, derived from the classical scheme by J. E. Kelley, maintain outer polyhedral models of the feasible region or epigraph of the objective, iteratively refine these via cutting planes based on subgradient information, and solve updated master problems for improved solutions. Contemporary cutting-plane approaches generalize the scheme to more complex settings, including mixed-integer programs, nonsmooth convex optimization, and specific nonconvex problems under minimal assumptions.

1. Historical Foundations and General Framework

Kelley's original cutting-plane algorithm formalized an iterative process for continuous convex optimization by maintaining a polyhedral outer approximation to the epigraph of the objective function. Each iteration solves a master problem over this region, receives subgradient (or gradient) information at the current candidate solution, and adds a cut that excludes regions inconsistent with convexity. The general template is:

  • At each iteration $k$, collect points $x^0, \ldots, x^k$ and the corresponding (sub)gradient data.
  • Solve a linear or mixed-integer program over the polyhedron defined by all existing cuts.
  • Evaluate feasibility and optimality at this solution; if not satisfactory, generate a new cut from subgradient information and append it to the model.
  • Repeat until the feasible region is approximated to desired tolerance.
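
The template above can be sketched on a toy one-dimensional problem. This is a minimal illustration under assumed data (minimizing $f(x) = x^2$ over $[-2, 2]$), not the implementation from any of the cited papers; the master problem is a small LP solved with SciPy:

```python
import numpy as np
from scipy.optimize import linprog

# Minimal Kelley cutting-plane sketch (assumed toy problem):
# minimize f(x) = x^2 over [-2, 2] by accumulating gradient cuts
# t >= f(x_k) + f'(x_k)(x - x_k) and solving the polyhedral master LP.
f = lambda x: x**2
df = lambda x: 2.0 * x

cuts = []                      # each cut (a, b) encodes t >= a*x + b
x_k = 2.0                      # initial trial point
for _ in range(100):
    cuts.append((df(x_k), f(x_k) - df(x_k) * x_k))
    # master LP in (x, t): minimize t subject to a*x - t <= -b, -2 <= x <= 2
    A = [[a, -1.0] for a, _ in cuts]
    rhs = [-b for _, b in cuts]
    res = linprog(c=[0.0, 1.0], A_ub=A, b_ub=rhs,
                  bounds=[(-2.0, 2.0), (None, None)])
    x_k, t_k = res.x
    if f(x_k) - t_k < 1e-6:    # polyhedral model matches f: stop
        break

print(x_k, f(x_k))             # converges toward the minimizer x* = 0
```

Because the polyhedral model always underestimates $f$, the gap $f(x^k) - t_k$ is a valid optimality certificate, which is the stopping criterion used above.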

This process is the basis for a wide array of outer approximation schemes in modern optimization, including those for mixed-integer nonlinear programming (MINLP), nonsmooth convex optimization, and copositive programming (Serrano et al., 2019).

2. Gauge Reformulation and Extended Supporting Hyperplane Algorithm

A significant development connects the Kelley approach to gauge-based formulations. For a convex set $C$, the gauge function $\varphi_C(x) = \inf\{t > 0 : x \in tC\}$ is convex and positively homogeneous, providing the representation $C = \{x : \varphi_C(x) \leq 1\}$.
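
For a concrete (assumed) example, take $C$ to be the unit Euclidean ball, for which the gauge has the closed form $\varphi_C(x) = \|x\|$:

```python
import numpy as np

# Gauge function sketch for the assumed example C = unit Euclidean ball:
# phi_C(x) = inf{t > 0 : x in t*C} = ||x||, so membership in C is
# exactly phi_C(x) <= 1.
def gauge_ball(x):
    return np.linalg.norm(x)          # closed form for this particular C

x = np.array([3.0, 4.0])
print(gauge_ball(x))                  # 5.0: x lies in 5*C, i.e. outside C
# Positive homogeneity: phi_C(2x) = 2 * phi_C(x)
print(gauge_ball(2 * x) == 2 * gauge_ball(x))
```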

The Extended Supporting Hyperplane (ESH) algorithm formalizes cut generation by projecting candidate points onto the boundary $\partial C$ via a line search from an interior point. Cuts are then formed at boundary points using gradient information:

  1. At iteration $k$, solve the relaxation to obtain $x^k$.
  2. Line search from an interior point toward $x^k$ until $\max_j g_j(x) = 0$, yielding a projection $\tilde{x}^k$ on $\partial C$.
  3. Form a gradient cut $g_j(\tilde{x}^k) + \nabla g_j(\tilde{x}^k)^\top(x - \tilde{x}^k) \leq 0$ for each active $j$.
  4. Add the cut to the polyhedral master problem and repeat.
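
The projection and cut-generation steps can be sketched as follows. The set and points are assumed examples (the unit disk $g(x) = \|x\|^2 - 1 \leq 0$ with the origin as interior point), and a simple bisection stands in for the line search:

```python
import numpy as np

# ESH-style cut generation sketch (assumed example set: the unit disk
# g(x) = ||x||^2 - 1 <= 0, with interior point at the origin).
g = lambda x: np.dot(x, x) - 1.0
grad_g = lambda x: 2.0 * x

def esh_cut(x_int, x_k, tol=1e-10):
    """Bisect from interior x_int toward exterior x_k to find the boundary
    point, then return the gradient cut (a, b) meaning a @ x <= b."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(x_int + mid * (x_k - x_int)) <= 0:
            lo = mid
        else:
            hi = mid
    x_b = x_int + lo * (x_k - x_int)          # projection onto the boundary
    a = grad_g(x_b)
    b = a @ x_b - g(x_b)                      # from g(x_b) + a@(x - x_b) <= 0
    return a, b, x_b

a, b, x_b = esh_cut(np.zeros(2), np.array([2.0, 2.0]))
# The cut separates the trial point yet supports the set at the boundary:
print(a @ np.array([2.0, 2.0]) > b, abs(g(x_b)) < 1e-6)
```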

These steps correspond precisely to Kelley’s method when reformulated on the gauge constraint $\varphi_C(x) \leq 1$. The equivalence proof demonstrates that each supporting cut generated at the boundary corresponds to a Kelley-type cut at a boundary linearization of the gauge function (Serrano et al., 2019).

3. Supporting Hyperplane Geometry, Cut Characterization & Equivalence Theorems

Cutting-plane efficacy critically depends on where cuts are generated:

  • A gradient cut at $y$ for convex $g$ gives the tangent hyperplane $g(y) + \nabla g(y)^\top(x - y) \leq 0$, which is valid for the set $\{g \leq 0\}$ but does not necessarily support its boundary.
  • The geometric criterion for a supporting cut is the existence of $x_0 \in \{g \leq 0\}$ such that the function $\lambda \mapsto g(x_0 + \lambda(y - x_0))$ is affine on $[0, 1]$. Sublinear (positively homogeneous) $g$ guarantees that all gradient cuts are supporting.
  • For gauge-reformulated sets, boundary projection ensures each cut strictly separates the current iterate from $C$, and rescaling the gradient yields cuts equivalent to the Kelley linearization of the gauge.
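
The contrast above can be checked numerically on two assumed toy cases:

```python
import numpy as np

# (1) Non-homogeneous g(x) = x^2 - 1 with {g <= 0} = [-1, 1]: the gradient
#     cut at y = 2 is 3 + 4*(x - 2) <= 0, i.e. x <= 1.25, which is valid
#     for the set but never touches its boundary point x = 1.
y = 2.0
cut_boundary = y - (y**2 - 1) / (2 * y)      # zero of the linearization
print(cut_boundary)                          # 1.25, strictly outside [-1, 1]

# (2) Sublinear g(x) = |x1| - x2 with cone {x2 >= |x1|}: at y = (2, 1) the
#     gradient is (1, -1) and g(y) = grad . y (Euler's relation for degree-1
#     homogeneity), so the gradient cut collapses to grad . x <= 0, which
#     passes through the origin and supports the cone at (1, 1).
yv = np.array([2.0, 1.0])
grad = np.array([1.0, -1.0])
print(grad @ yv == abs(yv[0]) - yv[1])       # True: Euler's relation
print(grad @ np.array([1.0, 1.0]))           # 0.0: the cut touches the cone
```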

These geometric insights ensure Kelley-type and ESH-type schemes retain finite convergence, supporting both continuous and mixed-integer convex programs (Serrano et al., 2019). A plausible implication is that precise boundary projection can considerably tighten the polyhedral outer model compared to arbitrary local cuts.

4. Extensions: Nonsmooth, Analytic Center, and Nonconvex Variants

Recent work generalizes Kelley-like algorithms beyond traditional convexity:

  • Optimal Kelley-like Methods for Nonsmooth Convex Optimization: By incorporating trust-region constraints and auxiliary variables into the master problem, optimal variants attain the rate $O(1/\sqrt{N})$ for Lipschitz convex minimization, avoiding the instability of the classical method (Drori et al., 2014).
  • Analytic Center Cutting Plane Methods (ACCPM): The analytic center of the current polyhedral approximation is selected as the next query point, rather than an arbitrary trial point. This approach dramatically reduces the number of expensive oracle calls (e.g., for copositivity checks), achieving empirical scaling of roughly $O(d^2)$ in the matrix dimension $d$, versus much slower ellipsoid-based methods (Badenbroek et al., 2020).
  • Gradient-based Heuristics and Projections: For discrete domains, projected-gradient methods are used to identify feasible points closer to the optimal solution, after which MILP-based projection and cut generation efficiently shrink the search region. Empirical tests on binary quadratic programs confirm roughly 80% reductions in MILP solves and consistent exact optimality within time limits (Bùi et al., 1 Nov 2025).
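
A generic projected-gradient step of the kind described in the last bullet can be sketched as follows. This is an illustrative stand-in on the box relaxation of a binary quadratic objective, not the exact heuristic or MILP projection of Bùi et al.:

```python
import numpy as np

# Projected-gradient sketch on the box relaxation [0, 1]^n of a binary
# quadratic objective f(x) = x^T Q x (assumed random instance).
rng = np.random.default_rng(0)
n = 8
Q = rng.standard_normal((n, n))
Q = (Q + Q.T) / 2.0                           # symmetrize

x = np.full(n, 0.5)                           # start at the box center
for _ in range(200):
    # gradient step on f, then Euclidean projection back onto the box
    x = np.clip(x - 0.05 * (2.0 * Q @ x), 0.0, 1.0)

x_bin = np.round(x)                           # binary incumbent for
print(x_bin)                                  # subsequent cut generation
```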

These extensions maintain the Kelley structure—solve, cut, aggregate—but replace the center and cut-generation steps with locally or globally enhanced proxies suited for modern high-dimensional or discrete problems.

5. Mild Conditions for Convergence and Nonconvex Problems

Finite convergence of Kelley Cutting Plane-like methods is guaranteed under mild differentiability assumptions:

  • At boundary points $x \in \partial C$, gradients of all active constraint functions must be nonzero.
  • No strict convexity is required for individual constraints; convexity of zero-sublevel sets suffices.
  • Nonconvex representations are permissible provided the zero-sublevel sets are convex and the gradient-on-boundary condition holds.

Moreover, cutting-plane methods can be extended for nonconvex optimization via the "convex until proven guilty" principle. By repeatedly constructing trust-region subproblems, verifying local convexity by the behavior of cuts and volume reduction, and exploiting directions of negative curvature, these generalized schemes achieve $O(\varepsilon^{-4/3})$ expected total runtime to find $\varepsilon$-stationary points, improving upon cubic regularization's $O(\varepsilon^{-3/2})$ scaling (Hinder, 2018).

6. Computational Performance and Implementation Guidelines

Practical deployment of Kelley-like cutting-plane methods requires attention to several aspects:

  • Choice of the interior point for projection is crucial; analytic centers or simple feasible solutions improve early relaxation quality.
  • Limited, accurate line searches suffice to find boundary projections for supporting cuts.
  • Early iterations favor LP relaxations, while later stages may require MILP cut management after introducing integrality constraints.
  • Empirical tests report dramatic reductions in the number of cuts, improved warm starts, and faster relaxation in convex and mixed-integer nonlinear solvers (Serrano et al., 2019). Analytic center updates further enhance cut utility, reducing costly oracle calls (Badenbroek et al., 2020).
  • Manage cut pool size by eliminating redundant or inactive cuts—keeping only those with nonzero dual multipliers in recent master solves.
  • Projected-gradient heuristics, when integrated, yield significant speed-ups and improved solution fidelity, especially in large-scale discrete problems (Bùi et al., 1 Nov 2025).
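
The dual-multiplier pruning rule above can be sketched with SciPy's LP solver, which exposes the inequality duals via `res.ineqlin.marginals` (a minimal sketch on an assumed toy cut pool, not a production cut manager):

```python
from scipy.optimize import linprog

# Cut-pool pruning sketch: after a master LP solve, keep only cuts whose
# inequality carries a nonzero dual multiplier (i.e., is active at the
# optimum). Toy master: minimize t s.t. t >= a_i*x + b_i, x in [-2, 2].
cuts = [(4.0, -4.0), (-4.0, -4.0), (1.0, -10.0)]   # third cut is redundant
A = [[a, -1.0] for a, _ in cuts]                   # encodes a*x - t <= -b
rhs = [-b for _, b in cuts]
res = linprog(c=[0.0, 1.0], A_ub=A, b_ub=rhs,
              bounds=[(-2.0, 2.0), (None, None)])
duals = res.ineqlin.marginals                      # one multiplier per cut
kept = [c for c, y in zip(cuts, duals) if abs(y) > 1e-9]
print(len(kept))                                   # the redundant cut is dropped
```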

7. Impact on Optimization Methodology and Connections

Kelley Cutting Plane-like methods underpin a wide variety of modern approaches in convex, mixed-integer, nonsmooth, and nonconvex optimization. Their flexibility arises from the ability to incorporate structural reformulations (gauge, support functions), analytic centers, gradient-based heuristics, and trust-region or bundle modifications. The equivalence proofs between ESH and Kelley’s method on gauge-reformulated sets anchor their theoretical guarantees. They have demonstrable advantages over classical approaches such as ellipsoid methods, and modern instantiations are competitive with or superior to cubic regularization in nonconvex settings for certain parameter regimes.

Contemporary research suggests ongoing improvements in solver performance, generalizability to broader discrete domains, and hybridization with first-order and heuristic local search subroutines. Given their foundational role and robust convergence properties, Kelley-like cutting-plane methods remain a central computational tool in mathematical optimization.
