Difference-of-Convex Algorithm (DCA) Advances
- Difference-of-Convex Algorithm (DCA) is a method that writes a nonconvex objective as the difference of two convex functions, enabling iterative minimization through a sequence of convex surrogate problems.
- Boosted variants (BDCA) use backtracking and quadratic interpolation to significantly accelerate convergence, reducing iterations and computational time.
- Convergence analysis via the Łojasiewicz property provides rigorous rate guarantees, and the method scales to large problems such as biochemical network steady-state analysis.
A difference-of-convex (DC) algorithm, often abbreviated as DCA, is a structured iterative method designed for the minimization of functions that are explicitly represented as the difference of two convex functions. Its relevance spans nonconvex optimization, particularly where the nonconvexity is “tame” in the DC sense and admits efficient convex minorization. Recent research has led to substantial advances, including algorithmic accelerations, refined convergence analysis via the Łojasiewicz property, rigorous rate guarantees, and biologically grounded applications such as biochemical network analysis.
1. Classical DCA and Algorithmic Acceleration
The standard DCA operates on problems with objective $\phi(x) = g(x) - h(x)$, where $g$ and $h$ are convex, smooth functions of $x \in \mathbb{R}^n$. At iteration $k$, the concave part $-h$ is replaced by its affine majorant at $x_k$, generating a surrogate convex program
$$
y_k \;=\; \operatorname*{arg\,min}_{x}\; g(x) - \big(h(x_k) + \langle \nabla h(x_k),\, x - x_k \rangle\big),
$$
where quadratic regularization (adding $\tfrac{\rho}{2}\|x\|^2$ to both $g$ and $h$, parameterized by $\rho > 0$) ensures strong convexity of the surrogate. Its minimizer $y_k$ serves as the next iterate in classical DCA ($x_{k+1} = y_k$).
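To make the surrogate construction concrete, the following is a minimal sketch of one classical DCA step on a toy two-dimensional DC objective; the functions, the use of SciPy's L-BFGS-B as a generic surrogate solver, and all parameter values are illustrative assumptions rather than the paper's setup.

```python
import numpy as np
from scipy.optimize import minimize

# Toy DC decomposition (illustrative): phi(x) = g(x) - h(x)
g = lambda x: np.sum(x**4) + 0.5 * np.sum(x**2)   # convex
h = lambda x: np.sum(x**2)                        # convex, smooth
grad_h = lambda x: 2.0 * x

def dca_step(xk, rho=1.0):
    """One classical DCA step: minimize the convex surrogate obtained by
    linearizing h; rho/2 * ||x||^2 is added to both parts for strong convexity."""
    lin = grad_h(xk) + rho * xk                   # gradient of h + rho/2 * ||x||^2 at xk
    surrogate = lambda x: g(x) + 0.5 * rho * (x @ x) - lin @ x
    return minimize(surrogate, xk, method="L-BFGS-B").x   # y_k = x_{k+1} in classical DCA

xk = np.array([2.0, -1.5])
for _ in range(25):
    xk = dca_step(xk)
print(xk)  # approaches a critical point of phi
```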
Two “Boosted DCA” (BDCA) variants (Artacho et al., 2015) are introduced to accelerate this process:
- BDCA with Backtracking: After computing $y_k$, a line search is performed along $d_k = y_k - x_k$ starting from $y_k$, seeking a step size $\lambda_k > 0$ subject to the Armijo-type condition
$$
\phi(y_k + \lambda_k d_k) \;\le\; \phi(y_k) - \alpha \lambda_k^2 \|d_k\|^2, \qquad \alpha > 0.
$$
- BDCA with Quadratic Interpolation and Backtracking: Here, a quadratic model of $\lambda \mapsto \phi(y_k + \lambda d_k)$ is constructed, utilizing $\phi(y_k)$, its directional derivative $\phi'(y_k; d_k)$, and the value at a trial step. The minimizer of this quadratic model is tried first, followed by backtracking if needed.
Both algorithms consistently yield larger per-iteration decreases in $\phi$ than classical DCA by exploiting that $d_k = y_k - x_k$ is a descent direction for $\phi$ evaluated at $y_k$:
$$
\langle \nabla \phi(y_k),\, d_k \rangle < 0 \quad \text{whenever } d_k \neq 0,
$$
so that $\phi(y_k + \lambda d_k) < \phi(y_k)$ for all sufficiently small $\lambda > 0$.
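As an illustration of the backtracking step, here is a minimal sketch assuming a callable `phi` for the full objective and NumPy arrays for the iterates; the helper name `backtracking_step` and the constants are illustrative, not from the source.

```python
import numpy as np

def backtracking_step(phi, yk, dk, alpha=0.05, lam0=1.0, beta=0.5, max_backtracks=30):
    """Armijo-type backtracking along dk from yk: accept the first lam satisfying
    phi(yk + lam*dk) <= phi(yk) - alpha * lam**2 * ||dk||**2."""
    phi_y = phi(yk)
    nd2 = float(dk @ dk)
    lam = lam0
    for _ in range(max_backtracks):
        if phi(yk + lam * dk) <= phi_y - alpha * lam**2 * nd2:
            return yk + lam * dk, lam
        lam *= beta
    return yk, 0.0  # no acceptable step found: fall back to the plain DCA iterate
```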
2. Theoretical Convergence Properties
Under standard assumptions—local Lipschitz continuity of the gradients, $\phi$ bounded below, and in particular the Łojasiewicz property—the BDCA variants are globally convergent. The Łojasiewicz property ensures that for some $M > 0$ and $\theta \in [0, 1)$,
$$
|\phi(x) - \phi(x^*)|^{\theta} \;\le\; M\, \|\nabla \phi(x)\|
$$
in a neighborhood of a critical point $x^*$. This enables the establishment of convergence rates for $\|x_k - x^*\|$ and for the objective sequence $\{\phi(x_k)\}$. Specifically:
- $\theta = 0$: finite-step convergence.
- $\theta \in (0, \tfrac{1}{2}]$: linear convergence.
- $\theta \in (\tfrac{1}{2}, 1)$: sublinear convergence, quantified as $\|x_k - x^*\| = O\!\big(k^{-(1-\theta)/(2\theta-1)}\big)$.
The rate analysis is established via an energy decrement lemma for sequences satisfying a sufficient-decrease condition of the form $\phi(x_k) - \phi(x_{k+1}) \ge c\,\|x_{k+1} - x_k\|^2$ with $c > 0$.
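In compact form, and under the assumptions above, the three regimes can be summarized as follows; the constants $C > 0$ and $q \in (0,1)$ depend on the problem, and this restatement is a sketch of the standard Łojasiewicz-based argument rather than a quotation of the paper:
$$
\|x_k - x^*\| \;\le\;
\begin{cases}
0 \ \text{(finite termination)}, & \theta = 0,\\
C\,q^{k}, & \theta \in (0, \tfrac{1}{2}],\\
C\,k^{-(1-\theta)/(2\theta-1)}, & \theta \in (\tfrac{1}{2}, 1),
\end{cases}
\qquad C > 0,\; q \in (0,1).
$$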
3. Implementation for Smooth DC Problems
For smooth and strongly convex settings relevant to biochemical networks, the implementation proceeds as follows:
- Initialize $x_0$ in the feasible set.
- At iteration $k$:
  - Compute $\nabla h(x_k)$.
  - Solve the strongly convex surrogate for $y_k$.
  - Set $d_k = y_k - x_k$.
  - If $d_k = 0$ (or $\|d_k\|$ falls below a tolerance), stop.
  - Else, perform a line search (with or without quadratic interpolation) for the step size $\lambda_k$.
  - Update $x_{k+1} = y_k + \lambda_k d_k$.
The strongly convex subproblem for $y_k$ and the line search can be implemented with standard convex optimization techniques; one possible realization of the subproblem solver is sketched below. The extra computational cost over vanilla DCA is dominated by additional function evaluations for the line search. In settings with analytic or closed-form gradients (as in biochemical kinetics), these evaluations can be efficiently vectorized.
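One way to realize the subproblem solver referenced in the pseudocode below is sketched here with SciPy, under the assumption that $g$ is smooth; the function name matches the pseudocode, but the solver choice (L-BFGS-B) and the default starting point are illustrative assumptions.

```python
import numpy as np
from scipy.optimize import minimize

def solve_convex_subproblem(g, grad_h_k, rho, x_start=None):
    """Minimize the strongly convex surrogate  g(x) + (rho/2)*||x||^2 - <grad_h_k, x>.
    grad_h_k is the (regularized) gradient of h at the current iterate; warm-starting
    from the current iterate is advisable when it is available."""
    if x_start is None:
        x_start = np.zeros_like(grad_h_k)
    obj = lambda x: g(x) + 0.5 * rho * (x @ x) - grad_h_k @ x
    return minimize(obj, x_start, method="L-BFGS-B").x
```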
Pseudocode outline:
```python
import numpy as np

def bdca_step(xk, grad_h, g, rho, line_search_params):
    """One BDCA iteration: DCA subproblem followed by a backtracking line search."""
    tol, lam0, reduction_factor, alpha = line_search_params
    grad_h_k = grad_h(xk)
    yk = solve_convex_subproblem(g, grad_h_k, rho)   # e.g., via Newton or CG
    dk = yk - xk
    if np.linalg.norm(dk) < tol:
        return yk, True                              # (approximately) stationary: stop
    # Armijo or quadratic-interpolation line search along the descent direction dk;
    # armijo_condition is an assumed helper with access to the full objective phi = g - h
    lam, accepted = lam0, False
    for _ in range(50):                              # cap the number of backtracking steps
        if armijo_condition(yk, dk, lam, alpha):
            accepted = True
            break
        lam *= reduction_factor
    xk1 = yk + lam * dk if accepted else yk          # null step: fall back to the DCA iterate
    return xk1, False
```
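A possible outer loop driving `bdca_step` is sketched below; the default parameter tuple (tolerance, initial step, reduction factor, Armijo constant) holds illustrative values, not settings from the source.

```python
def bdca(x0, grad_h, g, rho=1.0, max_iter=500,
         line_search_params=(1e-8, 1.0, 0.5, 0.05)):  # (tol, lam0, reduction_factor, alpha)
    """Run BDCA until the DCA step becomes (approximately) stationary."""
    xk = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        xk, converged = bdca_step(xk, grad_h, g, rho, line_search_params)
        if converged:
            break
    return xk
```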
4. Numerical Performance and Biochemical Network Application
The BDCA is applied to biochemical network steady-state problems, formulated via a logarithmic transformation ($x_i = \log c_i$ for concentrations $c_i$), resulting in real analytic, hence Łojasiewicz, objective functions. Each coordinate update involves convex operations on sums of exponentials and linear terms determined by stoichiometry and kinetics.
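For orientation, the structure of such an objective can be sketched as follows; the stoichiometric matrices, rate constants, and the least-squares merit function are illustrative placeholders and do not reproduce the paper's exact DC decomposition.

```python
import numpy as np

def steady_state_residual(x, S, F, log_k):
    """Net production rates under mass-action kinetics in log-concentration
    variables x = log(c): reaction rates are exponentials of affine maps of x."""
    v = np.exp(log_k + F.T @ x)   # elementary reaction rates (sums of exponentials)
    return S @ v                  # zero at a steady state

def phi(x, S, F, log_k):
    """Least-squares merit function whose minimization drives x toward steady state."""
    r = steady_state_residual(x, S, F, log_k)
    return 0.5 * float(r @ r)
```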
Empirical results (Artacho et al., 2015):
- Average iteration counts are reduced by a substantial factor relative to classical DCA.
- Target decreases in the objective function are reached in less computational time than with DCA.
- Scaling from hundreds to thousands of variables remains tractable.
- In each tested network (e.g., Ecoli_core, large-scale human metabolism), BDCA trajectories advance faster towards steady state.
5. Parameter Selection, Limitations, and Extensions
The parameter $\alpha$ in the Armijo condition should be chosen below (but close to) the strong convexity constant $\rho$ to avoid null steps ($\lambda_k = 0$). The quadratic interpolation variant may require bounding the interpolated step size above (e.g., by a fixed $\bar{\lambda}$) for robustness against overestimations in nonquadratic settings.
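An illustrative choice of line-search parameters consistent with these guidelines (the numerical values are assumptions, not recommendations from the source):

```python
rho = 2.0                 # strong convexity modulus of the regularized decomposition
alpha = 0.9 * rho         # Armijo constant kept below (but close to) rho
lam0 = 1.0                # initial trial step; interpolated steps can be capped, e.g., at 10.0
reduction_factor = 0.5
line_search_params = (1e-8, lam0, reduction_factor, alpha)  # (tol, lam0, reduction, alpha)
```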
Computational bottlenecks can arise for massive-scale networks if the convex subproblem solver is inefficient, but sparsity in the model (as in stoichiometric matrices) can be exploited. Careful vectorization and exploitation of analytic structure in $g$ and $h$ further enhance scalability.
Extensions to constrained and nonsmooth settings, e.g., incorporating linearly constrained DC programs, can be handled as in (Artacho et al., 2019) with appropriate modifications for feasibility at each step.
6. Significance in Broader Optimization Research
The acceleration analysis is situated within a wider context of DC programming for handling nonconvex and duplomonotone equations (cf. Aragón Artacho and Fleming, 2015, Optim. Lett.). The methodology is not restricted to biochemical models but is applicable wherever the objective possesses the required analytic and convex structure. This includes machine learning, sparse regression, and robust statistics, subject to appropriate DC reformulations.
The explicit reduction in iteration complexity and strong theoretical underpinnings position BDCA as a practically superior alternative to vanilla DCA for smooth DC programs exhibiting the Łojasiewicz property, making it a method of choice for practitioners facing large-scale smooth nonconvex optimization tasks with known DC structure.