
Self-adaptive Nonmonotone Subgradient Method

Updated 23 October 2025
  • SNSM is a generalized iterative method for unconstrained minimization of nonsmooth, nonconvex functions in the upper‑C² class.
  • It features nonmonotone Armijo-type line searches, self-adaptive parameter tuning, and flexible descent direction choices including Newton and quasi-Newton strategies.
  • Empirical results indicate that SNSM offers improved convergence and efficiency for clustering and mixed-integer optimization compared to classical methods.

The Self-adaptive Nonmonotone Subgradient Method (SNSM) is a generalized iterative optimization framework designed for unconstrained minimization of nonsmooth, nonconvex functions, specifically those belonging to the upper-$\mathcal{C}^2$ class. This class comprises functions that locally admit a nonsmooth descent lemma analogous to the quadratic upper-bounding property found in smooth optimization. Distinguished by its use of nonmonotone Armijo-type line searches, automatic parameter adaptation, and flexible incorporation of descent directions (including Newton or quasi-Newton steps), SNSM is positioned as a robust alternative to classical monotone descent schemes for a wide range of problems, including minimum sum-of-squares clustering (MSC) and quadratic optimization with integer constraints (Aragón-Artacho et al., 22 Oct 2025).

1. Theoretical Foundations: Upper-$\mathcal{C}^2$ Functions and Local Descent Lemma

SNSM is formulated for objectives $\varphi:\mathbb{R}^n \to \mathbb{R}$ that are upper-$\mathcal{C}^2$, i.e., each such $\varphi$ can locally be represented as the minimum of a parametric family of twice continuously differentiable functions. Formally, around any point $\bar{x}$, there exists a compact set $C$ such that

$$\varphi(x) = \min_{c \in C}\left\{\kappa \|x\|^2 - \langle a(c), x \rangle - b(c)\right\},$$

where $a(\cdot)$ and $b(\cdot)$ are continuous functions, and $\kappa = \kappa(\bar{x}) \geq 0$ is a local curvature constant.
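
For instance, the pointwise minimum of finitely many squared distances (a standard example given here for illustration rather than quoted from the paper) fits this representation globally with $\kappa = 1$:

$$\min_{1 \le j \le p} \|x - a^j\|^2 = \min_{1 \le j \le p}\left\{\|x\|^2 - \langle 2a^j, x \rangle + \|a^j\|^2\right\},$$

so that $C = \{1,\dots,p\}$, $a(j) = 2a^j$, $b(j) = -\|a^j\|^2$, and $\kappa = 1$; this is exactly the building block of the clustering objective treated in Section 4.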

This structure forces $\varphi$ to satisfy a nonsmooth descent lemma: for any $x$ in a suitable neighborhood $V$ and any $w \in \partial \varphi(x)$,

$$\varphi(y) \leq \varphi(x) + \langle w, y - x \rangle + \kappa \|y - x\|^2, \quad \forall y \in V.$$

This surrogate of the classic smooth descent lemma underpins the logic of line search acceptance and forms the backbone of the convergence analysis.
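
To make the inequality concrete, the short Python check below (an illustration based on the min-of-squared-distances example above; the function names are chosen here for exposition and are not code from the paper) verifies the descent lemma numerically with $\kappa = 1$, using the subgradient $w = 2(x - a^{j(x)})$ induced by an active index $j(x)$:

```python
import numpy as np

rng = np.random.default_rng(0)

def phi(x, A):
    """phi(x) = min_j ||x - a^j||^2 for rows a^j of A (upper-C^2 with kappa = 1)."""
    return np.min(np.sum((A - x) ** 2, axis=1))

def subgradient(x, A):
    """A subgradient at x: gradient of the active (minimizing) squared distance."""
    j = np.argmin(np.sum((A - x) ** 2, axis=1))  # active index j(x)
    return 2.0 * (x - A[j])

A = rng.normal(size=(5, 3))   # five data points a^j in R^3
kappa = 1.0                   # global curvature constant for this example

for _ in range(1000):
    x, y = rng.normal(size=3), rng.normal(size=3)
    w = subgradient(x, A)
    lhs = phi(y, A)
    rhs = phi(x, A) + w @ (y - x) + kappa * np.sum((y - x) ** 2)
    assert lhs <= rhs + 1e-12, "descent lemma violated"
print("nonsmooth descent lemma verified on random samples")
```

Because the bound holds globally for this particular example, the assertion never fires; for general upper-$\mathcal{C}^2$ functions it is only guaranteed on a neighborhood $V$ of the reference point.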

2. Algorithm Structure: General Nonmonotone Subgradient Scheme

At each iteration $k$, SNSM proceeds as follows (see Algorithm 1 in (Aragón-Artacho et al., 22 Oct 2025)):

  • Subgradient Selection: Compute $w_k \in \partial \varphi(x_k)$. If $w_k = 0$, the point is stationary and the algorithm terminates.
  • Descent Direction Choice: Select $d_k$ such that $\langle w_k, d_k \rangle < 0$. The direction can be constructed via first-order (steepest descent), second-order (Newton-type), or quasi-Newton strategies.
  • Nonmonotone Armijo Line Search:

    • Accept a trial step size $\tau_k$ if

    $$\varphi(x_k + \tau_k d_k) \leq \max_{i=[k-m_k]^+,\dots,k}\{\varphi(x_i)\} + \sigma \tau_k \langle w_k, d_k \rangle,$$

    where $[k-m_k]^+$ denotes the start index of the memory window (with $m_k \geq 0$), and $\sigma \in (0,1)$ is the Armijo parameter.
    • If the condition fails, $\tau_k$ is reduced by multiplication with $\beta \in (0,1)$ until acceptance.

  • Iterate Update: Set $x_{k+1} = x_k + \tau_k d_k$.

This flexible acceptance, by considering the maximum function value over a sliding window of the last $m_k$ iterates, enables exploratory steps that are not strictly descending, facilitating escape from shallow local minima and nonsmooth kinks.
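
For concreteness, a minimal Python sketch of this generic scheme is given below. It fixes the memory length and the initial trial step (the self-adaptive updates of Section 3 are omitted), uses the steepest-descent choice $d_k = -w_k$, and its function names and default parameter values are illustrative assumptions rather than the paper's Algorithm 1:

```python
import numpy as np

def snsm_basic(phi, subgrad, x0, max_iter=200, sigma=1e-4, beta=0.5,
               tau_bar=1.0, m=5, tol=1e-8):
    """Generic nonmonotone subgradient scheme with Armijo-type backtracking.

    phi: objective; subgrad: returns some w in the subdifferential at x.
    The acceptance test compares against the max of the last m+1 objective values.
    """
    x = np.asarray(x0, dtype=float)
    history = [phi(x)]                      # stores phi(x_i) for the memory window
    for _ in range(max_iter):
        w = subgrad(x)
        if np.linalg.norm(w) <= tol:        # w_k = 0 (numerically): stationary point
            break
        d = -w                              # any direction with <w, d> < 0 is admissible
        slope = w @ d                       # negative by construction
        ref = max(history[-(m + 1):])       # nonmonotone reference value
        tau = tau_bar
        while phi(x + tau * d) > ref + sigma * tau * slope:
            tau *= beta                     # backtrack until acceptance
        x = x + tau * d
        history.append(phi(x))
    return x

# Tiny usage example on the nonsmooth function phi(x) = min_j ||x - a^j||^2:
A = np.array([[1.0, 0.0], [-1.0, 2.0], [0.5, -1.5]])
phi = lambda x: np.min(np.sum((A - x) ** 2, axis=1))
subgrad = lambda x: 2.0 * (x - A[np.argmin(np.sum((A - x) ** 2, axis=1))])
print(snsm_basic(phi, subgrad, x0=np.zeros(2)))
```

Running the example converges to one of the data points, where the vanishing subgradient triggers termination.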

3. Automatic Parameter Adaptation

SNSM, as specialized in Algorithm 2 of (Aragón-Artacho et al., 22 Oct 2025), features full self-adaptation of the critical line search parameters:

  • Trial Step Size ($\bar{\tau}_k$): Increased if initial step sizes are repeatedly accepted; reset or decreased otherwise. Specifically, if two consecutive initial trial step sizes are accepted, set $\bar{\tau}_{k+1} = \gamma \tau_k$ with $\gamma > 1$.
  • Memory Window ($m_k$): Adapted dynamically. If the initial line search fails, $m_k$ is incremented (up to a fixed $m$) to relax the acceptance criterion. This process ensures the nonmonotone condition remains attainable while controlling conservativeness.
  • Update Mechanics: When a step reduction occurs, reset $\bar{\tau}_{k+1}$ to $\max\{\tau_k, \tau_{\min}\}$ and adapt $m_{k+1}$ to the minimum value sufficient for acceptance.

This dynamic tuning enables the method to balance exploration (via larger step sizes and nonmonotonicity) against exploitation, returning to conservative behavior when recent progress indicates it.
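
A minimal sketch of one way to wire these rules into code is shown below; the helper name, the counter of consecutively accepted initial steps, and the default values of $\gamma$, $\tau_{\min}$, and the memory cap are assumptions for illustration, and the update order follows the verbal description above rather than Algorithm 2 verbatim:

```python
def adapt_parameters(tau_used, tau_bar, backtracked, consec_full_accepts,
                     m_k, m_needed, gamma=2.0, tau_min=1e-3, m_max=10):
    """Update (tau_bar, m) after one line search, following the rules sketched above.

    tau_used: accepted step size; backtracked: whether any step reduction occurred;
    consec_full_accepts: consecutive iterations whose initial trial step was accepted;
    m_needed: smallest memory length that would have sufficed for acceptance.
    """
    if backtracked:
        # Step reduction occurred: reset the trial step and shrink memory to what was needed.
        tau_bar_next = max(tau_used, tau_min)
        m_next = min(m_needed, m_max)
        consec_full_accepts = 0
    else:
        consec_full_accepts += 1
        # Two consecutive initial accepts: enlarge the trial step for the next iteration.
        tau_bar_next = gamma * tau_used if consec_full_accepts >= 2 else tau_bar
        m_next = m_k
    return tau_bar_next, m_next, consec_full_accepts

# Example: a second consecutive accepted initial step doubles the trial step size.
state = (1.0, 1.0, False, 1, 3, 0)   # tau_used, tau_bar, backtracked, count, m_k, m_needed
print(adapt_parameters(*state))       # -> (2.0, 3, 2)
```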

4. Applications: Minimum Sum-of-Squares Clustering and Mixed Integer Optimization

A key motivating example is the Minimum Sum-of-Squares Clustering (MSC) problem, specified by:

$$\varphi_{\text{MSC}}(X) = \frac{1}{p} \sum_{j=1}^p \omega^j(X), \quad \omega^j(X) := \min_{t}\|x^t - a^j\|^2,$$

where $X \in \mathbb{R}^{s \times \ell}$ denotes the cluster centroids, and $\{a^j\}_{j=1}^p$ are data points. $\varphi_{\text{MSC}}$ is nonsmooth due to the minimization over $t$.

  • Subgradient Calculation: For each $a^j$, determine the active centroid $t_j$ yielding the minimum, then form the associated subgradient.
  • Newton-like Direction: Lock the active indices, compute local differentiable approximations, and aggregate block-diagonal Hessians to generate a regularized Newton direction $d_k = -[\nabla^2 \varphi^k_{\text{MSC}}(X_k) + \alpha_k I]^{-1} w_k$ (both steps are sketched below).
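
The Python sketch below illustrates both steps for MSC. It is one concrete reading of the construction above: with the active indices locked, the smooth model's Hessian is block diagonal with blocks $(2 n_t / p) I$, where $n_t$ is the number of points currently assigned to centroid $t$; the scaling, the layout (centroids as columns of $X$), and the function names are assumptions for exposition rather than the paper's implementation:

```python
import numpy as np

def msc_value_and_subgradient(X, A):
    """MSC objective, a subgradient, and active assignments.

    X: (s, ell) centroid matrix (columns are centroids x^t); A: (p, s) data points.
    """
    # squared distances D[j, t] = ||x^t - a^j||^2
    D = np.sum((A[:, :, None] - X[None, :, :]) ** 2, axis=1)
    t_active = np.argmin(D, axis=1)            # active centroid index t_j per point
    p = A.shape[0]
    value = D[np.arange(p), t_active].mean()
    W = np.zeros_like(X)
    for j, t in enumerate(t_active):           # gradient of the locked (smooth) model
        W[:, t] += (2.0 / p) * (X[:, t] - A[j])
    return value, W, t_active

def regularized_newton_direction(X, A, alpha):
    """Direction d_k = -[H_k + alpha I]^{-1} w_k, where H_k is block diagonal with
    blocks (2 n_t / p) I coming from the locked active assignments."""
    _, W, t_active = msc_value_and_subgradient(X, A)
    p = A.shape[0]
    counts = np.bincount(t_active, minlength=X.shape[1])  # n_t points per centroid
    scale = 2.0 * counts / p + alpha                      # per-block diagonal of H_k + alpha I
    return -W / scale[None, :]

# Usage: 3 centroids in R^2 for 6 data points, mild regularization.
rng = np.random.default_rng(1)
A = rng.normal(size=(6, 2))
X = rng.normal(size=(2, 3))
print(regularized_newton_direction(X, A, alpha=1e-3).shape)   # (2, 3)
```

Because each block of the regularized Hessian is a positive multiple of the identity, the linear solve reduces to an elementwise division, and the resulting direction satisfies $\langle w_k, d_k \rangle < 0$ whenever $w_k \neq 0$.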

The algorithm is also applicable to quadratic optimization with integer constraints, where the feasible set consists of unions of small balls emulating integrality.

5. Numerical Experiments and Comparative Performance

Empirical evaluations in (Aragón-Artacho et al., 22 Oct 2025) focus on both the MSC problem and challenging nonconvex quadratic cases. The experimental protocol compares SNSM (nonmonotone and monotone variants) to established algorithms (DCA, iDCA, BDCA, RCSN, etc.), reporting metrics such as:

| Problem Type | Evaluated Measures | Observed SNSM Performance |
| --- | --- | --- |
| Quadratic integer | Iterations, function evaluations, time | Fewer iterations and function evaluations; competitive or better objective values |
| MSC | Iterations, function evaluations, time | Lower running times, improved efficiency, smaller objective values |

In MSC, the use of nonmonotone memory ($m > 0$) allows broader exploration and generally accelerates convergence relative to strictly monotone approaches.

6. Advantages, Limitations, and Open Directions

Advantages:

  • Directional Flexibility: Accepts any descent direction, including second-order and quasi-Newton, suitable for integration with semi-Newton frameworks.
  • Self-adaptive Control: Automated tuning of step sizes and nonmonotone criteria minimizes manual intervention and promotes robustness across problem instances.
  • Theoretical Convergence: Guarantees subsequential convergence to stationary points under mild upper-$\mathcal{C}^2$ assumptions.
  • Empirical Efficiency: Demonstrates favorable complexity and solution quality compared to classical DC approaches and boosting techniques on both synthetic and real-world problems.

Limitations and Open Issues:

  • Subsequential Convergence Only: The method currently guarantees only that some limit points are stationary. Stronger guarantees (e.g., full global convergence or faster rates) remain under investigation.
  • Sensitivity to Direction Quality: Effectiveness depends on the choice of descent direction; performance may degrade if subgradients or Hessian approximations are poor.
  • Potential for Further Refinement: The parameter adaptation scheme, while effective, may be further improved and analyzed for different problem classes. Extension to more complex constraints and broader nonsmooth models is plausible but has yet to be established.

7. Conclusion

The SNSM framework fuses nonmonotone descent strategies with dynamic parameter adaptation for upper-$\mathcal{C}^2$ nonsmooth nonconvex minimization. By leveraging local quadratic models, flexible line search memory, and adaptive step sizing, it reconciles efficient practical performance with theoretical convergence guarantees, in particular for clustering and mixed-integer problems, while supporting extensions to second-order schemes. Its favorable empirical performance and theoretical generality make it a compelling choice for large-scale, complex nonsmooth optimization, with open questions remaining around convergence rates and constrained extensions (Aragón-Artacho et al., 22 Oct 2025).

References (1)