PS-DCA: Proximal-Subgradient DC Algorithm

Updated 23 October 2025
  • PS-DCA is a hybrid optimization method that combines a proximal-subgradient step with a one-step DC refinement, designed for single-ratio fractional minimization of convex, homogeneous functions.
  • The algorithm integrates the proximal-subgradient step with the DC refinement to navigate nonconvex landscapes robustly while enforcing unit-sphere constraints.
  • Empirical results show that PS-DCA attains lower objective values in fewer iterations and effectively escapes local minima in spectral graph applications.

The Proximal-Subgradient-Difference of Convex Functions Algorithm (PS-DCA) is an iterative scheme for nonsmooth nonconvex optimization, specifically adapted to problems where the objective is a fractional or composite function expressible as the ratio (or difference) of convex, positively homogeneous functions. PS-DCA systematically integrates proximal subgradient steps with a refinement via a single step of the Difference-of-Convex (DC) Algorithm, yielding a hybrid iteration with improved global convergence properties and reduced sensitivity to the initial point compared with standard proximal-gradient approaches (Qi et al., 22 Oct 2025).

1. Algorithmic Structure and Problem Setting

PS-DCA is specifically formulated for single-ratio fractional minimization problems of the form

$$\min_{x \in \mathcal{S} \cap \Omega} E(x) = \frac{T(x)}{B(x)},$$

where $T: X \to \mathbb{R}$ is convex and positively homogeneous, $B: X \to \mathbb{R}$ is convex and absolutely homogeneous, $\mathcal{S}$ is the unit sphere, and $\Omega = \{x \mid B(x) \neq 0\}$ (Qi et al., 22 Oct 2025). The algorithm exploits the equivalence between minimizing $T(x)/B(x)$ over $x \in X \setminus \{0\}$ and the constrained problem $\min T(x)$ subject to $B(x) = 1$. This "lifting" allows for the design of first-order algorithms that operate efficiently on high-dimensional signal processing problems, such as the computation of generalized graph Fourier modes.
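
The equivalence rests on homogeneity: $B \geq 0$ (by convexity and absolute homogeneity), and for any $x$ with $B(x) > 0$,
$$T\!\left(\frac{x}{B(x)}\right) = \frac{T(x)}{B(x)}, \qquad B\!\left(\frac{x}{B(x)}\right) = 1,$$
so every value of the ratio is attained on the constraint set $\{x \mid B(x) = 1\}$, and conversely.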

The canonical PS-DCA iteration comprises two main steps:

  1. Proximal-Subgradient Step: Given an iterate $x_k$, compute a candidate

$$l_k = \operatorname{prox}_{\lambda_k T}\bigl(x_k + \lambda_k E(x_k)\, v_k\bigr),$$

where $v_k \in \partial B(x_k)$ is a subgradient and $E(x_k) = T(x_k)/B(x_k)$. This step approximately solves a regularized subproblem involving the current linearization of $E(x)$.

  2. One-Step DCA Refinement: Improve the candidate $l_k$ by performing one iteration of a DCA applied to the DC decomposition $T(x) - E(l_k)\, B(x)$ (plus regularization). Explicitly,

$$t_k = \operatorname{prox}_{T/\rho_k}\!\left( \frac{E(l_k)}{\rho_k}\, w_k \right),$$

where $w_k \in \partial B(0)$ is adaptively chosen and $\rho_k > 0$ is a regularization parameter.

The final update $x_{k+1}$ is projected onto the sphere:
$$x_{k+1} = \frac{y_k}{\|y_k\|}, \qquad \text{where } y_k = t_k \text{ if the DCA step is accepted and } y_k = l_k \text{ otherwise}.$$
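
To make the two-step structure concrete, the following is a minimal sketch of the iteration on an illustrative instance, $T(x) = \|x\|_1$ and $B(x) = \|x\|_2$ (so $\operatorname{prox}_{\lambda T}$ is soft-thresholding). The acceptance test for the refinement and the choice $w_k \in \partial B(l_k) \subseteq \partial B(0)$ are plausible readings, not details quoted from (Qi et al., 22 Oct 2025).

```python
import numpy as np

def soft_threshold(z, tau):
    """Componentwise soft-thresholding: prox of tau * ||.||_1."""
    return np.sign(z) * np.maximum(np.abs(z) - tau, 0.0)

def E(x):
    """Illustrative fractional objective E(x) = ||x||_1 / ||x||_2."""
    return np.linalg.norm(x, 1) / np.linalg.norm(x)

def ps_dca_step(x, lam=0.5, rho=1.0):
    """One PS-DCA iteration for T = ||.||_1, B = ||.||_2 (sketch only)."""
    # 1. Proximal-subgradient step with v in dB(x) = {x / ||x||_2} for x != 0.
    v = x / np.linalg.norm(x)
    l = soft_threshold(x + lam * E(x) * v, lam)           # prox_{lam T}
    # 2. One-step DCA refinement; w in dB(l), a subset of dB(0) (assumed choice).
    w = l / np.linalg.norm(l)
    t = soft_threshold((E(l) / rho) * w, 1.0 / rho)       # prox_{T / rho}
    # Keep the refinement only if it is well defined and improves E (assumed rule).
    y = t if np.linalg.norm(t) > 0 and E(t) < E(l) else l
    return y / np.linalg.norm(y)                          # project back onto the sphere

# Usage: minimize ||x||_1 / ||x||_2 over the unit sphere (minimizers are +/- e_i).
rng = np.random.default_rng(0)
x = rng.standard_normal(50)
x /= np.linalg.norm(x)
for _ in range(200):
    x = ps_dca_step(x)
print(E(x))  # typically close to 1, the global minimum for this instance
```

The guard on $t_k$ mirrors the "if the DCA step is accepted" branch above: in the degenerate case $t_k = 0$, the proximal-subgradient candidate $l_k$ is retained.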

2. Mathematical Foundations and Optimality

The underlying optimality and convergence theory is rooted in DC programming and classical subdifferential calculus. For the fractional program, first-order stationarity at $x^*$ is characterized by the existence of $v^* \in \partial B(x^*)$ such that

$$x^* = \operatorname{prox}_{\lambda T}\bigl(x^* + \lambda E(x^*)\, v^*\bigr).$$

This fixed-point characterization is a direct first-order optimality condition (Qi et al., 22 Oct 2025).

A point $x^*$ is a critical point if $E(x^*)\,\partial B(x^*) \cap \partial T(x^*) \neq \varnothing$. In particular, if $E(x^*)\,\partial B(0) \subseteq \partial T(0)$, then $x^*$ is a global minimizer. The refinement via the DCA step targets this criticality condition more directly than pure proximal-gradient approaches.
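
As a small worked example of the global condition (not drawn from the paper's experiments), take $T(x) = \|x\|_1$, $B(x) = \|x\|_2$, and $x^* = e_1$. Then $E(x^*) = 1$, $\partial B(0)$ is the Euclidean unit ball, and $\partial T(0)$ is the $\ell_\infty$ unit ball, so
$$E(x^*)\,\partial B(0) = \{u : \|u\|_2 \le 1\} \subseteq \{u : \|u\|_\infty \le 1\} = \partial T(0),$$
and $e_1$ is indeed a global minimizer of $\|x\|_1/\|x\|_2$, with minimum value $1$ attained at the signed coordinate vectors.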

The DC refinement decomposes the regularized subproblem via

$$g_k(x) = T(x) + \frac{1}{2\lambda_k}\|x\|^2, \qquad h_k(x) = E(x_k)\, B(x) + \frac{1}{2\lambda_k}\|x\|^2,$$

and uses a one-step DCA iteration.
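
For context, a single DCA iteration on this decomposition, started from a point $x_c$, is the standard update $x^+ \in \arg\min_x\, g_k(x) - \langle u, x\rangle$ with $u \in \partial h_k(x_c)$. Spelling this out (a routine calculation, not quoted from the paper):
$$u = E(x_k)\, v + \tfrac{1}{\lambda_k} x_c, \quad v \in \partial B(x_c)
\;\Longrightarrow\;
x^+ = \arg\min_x\, T(x) + \tfrac{1}{2\lambda_k}\|x - \lambda_k u\|^2
= \operatorname{prox}_{\lambda_k T}\!\bigl(x_c + \lambda_k E(x_k)\, v\bigr),$$
which recovers exactly the proximal-subgradient form of the updates in Section 1.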

3. Convergence Properties

Under the assumptions that the step-size sequence $\{\lambda_k\}$ and the regularization parameters $\{\rho_k\}$ are bounded (both above and away from zero), and that $T$ and $B$ are convex and (positively/absolutely) homogeneous, the algorithm satisfies the following properties (Qi et al., 22 Oct 2025):

  • Monotonic Descent: $E(x_k)$ decreases strictly unless the iterate sequence has stabilized. Specifically,

$$E(x_k) \geq E(x_{k+1}) + \frac{1}{\lambda_k B(l_k)}\|x_k - l_k\|^2,$$

with an additional strict decrease term if the DCA refinement is active.

  • Compactness and Stalling: The sequence $\{x_k\}$ remains on the unit sphere (thus bounded) and

$$\|x_k - x_{k+1}\| \to 0.$$

  • Subsequential Convergence: Every accumulation point $x^*$ of $\{x_k\}$ is a critical point, i.e., it satisfies the first-order condition above. Unless the set of cluster points is a nontrivial continuum, convergence is to a single point.

These monotonicity and stationarity conditions are robust to initialization and are maintained even when the DCA step is omitted (yielding the proximal-subgradient algorithm, PSA).
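
A quick way to sanity-check the monotone-descent property in practice is to track $E(x_k)$ along the iterates of the illustrative sketch from Section 1 (reusing `soft_threshold`, `E`, `ps_dca_step`, and `rng` defined there; this checks only non-increase, not the full descent certificate):

```python
# Continuation of the Section 1 sketch: verify that E(x_k) is non-increasing.
x = rng.standard_normal(50)
x /= np.linalg.norm(x)
values = [E(x)]
for _ in range(100):
    x = ps_dca_step(x)
    values.append(E(x))
assert all(a >= b - 1e-12 for a, b in zip(values, values[1:]))
```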

4. Relation to Proximal and DC Algorithms

The PS-DCA algorithm generalizes elements of proximal subgradient splitting (Cruz, 2014), leveraging splitting techniques where both T and B can be nonsmooth and only accessible via subgradient/proximal oracles. The DC refinement step is akin to a single iteration of the classical DCA (Banert et al., 2016), but tightly integrated within each main algorithmic loop. This fusion enables PS-DCA to better navigate nonconvex landscapes and avoid the pronounced local minima trapping observed for standard proximal-gradient schemes.

From an optimality perspective, PS-DCA's convergence theory and update structure inherit ideas from more general inexact DC frameworks (Zhang et al., 2023) and recent convergence rate analyses for DC-type splitting (Rotaru et al., 6 Mar 2025, Rotaru et al., 2 Jun 2025, Abbaszadehpeivasti et al., 18 Oct 2025), though (Qi et al., 22 Oct 2025) does not develop explicit complexity rates.

5. Numerical Experiments and Empirical Observations

Extensive experiments on the computation of generalized graph Fourier modes (GFMs), a canonical application area for fractional programs with DC structure, demonstrate the practical efficiency of PS-DCA (Qi et al., 22 Oct 2025). Compared to proximal-gradient (PGSA) and manifold proximal gradient algorithms (ManPG-Ada), PS-DCA consistently attains lower objective values, converges in fewer iterations, and is markedly less sensitive to initialization.

Key points in the design of the numerical studies include:

  • Graph types: community graphs, random geometric graphs (RGG), directed RGGs.
  • The fractional objective measures directional variation relative to a normalization term, with normalization constraints enforced via projection or subspace orthogonality (a toy instance is sketched after this list).
  • Empirical results show that the DCA step can escape low-quality local minima that trap proximal-gradient-only methods, yielding globally superior solutions for a fixed computational budget.
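
For orientation, the snippet below builds a toy directed two-community graph and evaluates an $\ell_1$ directional-variation ratio of the kind described above. The graph, the edge-difference operator, and the specific variation measure are illustrative assumptions, not the paper's exact experimental setup.

```python
import numpy as np

# Toy directed graph: two triangles joined by one cross edge (assumed example).
edges = [(0, 1), (1, 2), (2, 0), (3, 4), (4, 5), (5, 3), (2, 3)]
n = 6
D = np.zeros((len(edges), n))                 # edge-difference operator
for e, (i, j) in enumerate(edges):
    D[e, i], D[e, j] = 1.0, -1.0

def variation_ratio(x):
    """l1 directional variation over l2 normalization: ||Dx||_1 / ||x||_2."""
    return np.linalg.norm(D @ x, 1) / np.linalg.norm(x)

x_smooth = np.array([1.0, 1.0, 1.0, -1.0, -1.0, -1.0])    # constant on each community
x_random = np.random.default_rng(1).standard_normal(n)
print(variation_ratio(x_smooth), variation_ratio(x_random))  # smooth mode typically has the smaller ratio
```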

A summary table (abridged for clarity):

Method            Avg. Obj. Value   Avg. Iter.   CPU Time (s)
PS-DCA (Q=I)      Lower             Fewer        Lower
PGSA (Q=I)        Higher            More         Higher
ManPG-Ada (Q=I)   Higher            More         Higher

6. Applications and Implications

The principal application discussed is in spectral graph theory, specifically, the computation of generalized graph Fourier modes (including Laplacian eigenmaps, directed total variation, and sparse variation models). The PS-DCA framework applies broadly to signal processing, clustering, and graph-based dictionary learning, wherever fractional DC optimization arises over (sub)manifold constraints or normalization sets.

By incorporating an explicit DC step within a proximal-subgradient backbone, PS-DCA allows robust handling of:

  • Nonsmooth variation measures (including l₁, total variation-type objectives).
  • Hard normalization constraints, including manifold constraints (a common projection-based treatment is sketched after this list).
  • Initializations far from a global minimizer.
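
One common way to handle the unit-norm normalization together with orthogonality to previously computed modes is to project onto the orthogonal complement and renormalize. The sketch below shows this standard device; it is an implementation assumption, not a quotation of the paper's scheme.

```python
import numpy as np

def project_sphere_orth(x, U=None):
    """Project x onto the unit sphere intersected with range(U)^perp.

    U: matrix whose orthonormal columns are previously computed modes (or None).
    """
    if U is not None and U.size:
        x = x - U @ (U.T @ x)        # remove components along earlier modes
    return x / np.linalg.norm(x)     # renormalize onto the unit sphere
```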

7. Significance and Future Directions

PS-DCA unifies classical proximal splitting with DC decomposition, producing an algorithm less prone to initialization sensitivity and local-minimum stagnation than pure gradient-based competitors. The nuanced use of a “one-step DCA” within each iteration distinguishes it from typical first-order solvers and draws on a rich literature in DC programming and proximal methods (An et al., 2015, Wen et al., 2016, Abbaszadehpeivasti et al., 18 Oct 2025).

Open directions include explicit complexity rate analysis (e.g., under Polyak–Łojasiewicz or Kurdyka–Łojasiewicz conditions), extension to more general constraint sets, and adaptive integration of inexact or stochastic subgradient/proximal oracles in large-scale nonconvex machine learning settings.

