Weakly Convex Functions: Theory & Applications

Updated 19 April 2026
  • Weakly convex functions are proper, lower-semicontinuous functions that become convex when a quadratic term is added, generalizing both convex and smooth functions.
  • They exhibit stability under sum and composition, possess strong proximal properties, and underlie effective algorithms in nonsmooth, nonconvex optimization.
  • Their applications span robust estimation, sparse regression, and machine learning, with convergence guarantees provided by proximal and stochastic methods.

A weakly convex function is a proper, lower-semicontinuous function $f: \mathbb{R}^d \to \mathbb{R} \cup \{+\infty\}$ whose negative curvature is bounded in a precise sense: $f$ is called $\rho$-weakly convex if there exists $\rho \geq 0$ such that $x \mapsto f(x) + \tfrac{\rho}{2}\|x\|^2$ is convex. This class strictly generalizes convex functions ($\rho = 0$), subsumes smooth functions with Lipschitz gradient ($\rho$ equal to the gradient Lipschitz constant), and is central in modern nonsmooth and nonconvex optimization theory.
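As a quick numerical illustration of the quadratic-perturbation definition (a sanity check on sampled points, not a proof): $\sin$ has second derivative $-\sin(x) \geq -1$, so it is $1$-weakly convex, and $g(x) = \sin(x) + \tfrac{1}{2}x^2$ should pass a midpoint-convexity test:

```python
import math
import random

# sin'' = -sin >= -1, so sin is 1-weakly convex: adding (rho/2) x^2
# with rho = 1 should yield a convex function g.
rho = 1.0

def g(x):
    return math.sin(x) + 0.5 * rho * x * x

# Sampled midpoint-convexity check: g((x+y)/2) <= (g(x)+g(y))/2.
random.seed(0)
midpoint_convex = all(
    g(0.5 * (x + y)) <= 0.5 * (g(x) + g(y)) + 1e-9
    for x, y in ((random.uniform(-10, 10), random.uniform(-10, 10))
                 for _ in range(10_000))
)
print(midpoint_convex)  # True: no violation found on sampled pairs
```

The same test applied to $\sin$ alone (i.e., $\rho = 0$) fails immediately, which is exactly the gap weak convexity quantifies.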

1. Fundamental Definitions and Characterizations

Weak convexity can be formulated via several equivalent but operationally distinct conditions, all appearing widely in the literature:

  • Quadratic perturbation: $f$ is $\rho$-weakly convex if $f(x) + \frac{\rho}{2}\|x\|^2$ is convex.
  • Subgradient lower bound: For all $x, y \in \mathbb{R}^d$ and any $v \in \partial f(x)$,

$$f(y) \geq f(x) + \langle v, y - x \rangle - \frac{\rho}{2}\|y - x\|^2.$$

  • Secant inequality: For all $x, y \in \mathbb{R}^d$, $\lambda \in [0, 1]$,

$$f(\lambda x + (1-\lambda) y) \leq \lambda f(x) + (1-\lambda) f(y) + \frac{\rho}{2}\,\lambda(1-\lambda)\|x - y\|^2.$$

  • Differentiable case: If $f$ is $C^1$, then weak convexity is equivalent to

$$\langle \nabla f(x) - \nabla f(y),\, x - y \rangle \geq -\rho \|x - y\|^2 \quad \text{for all } x, y.$$

These conditions admit natural generalization to Banach and Hilbert spaces and underpin the analytic and algorithmic properties of weakly convex functions.
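The differentiable characterization above can be checked numerically for a concrete example (an illustrative sketch on sampled pairs, not a proof): $f(x) = \log(1 + x^2)$ has $f''(x) = (2 - 2x^2)/(1 + x^2)^2$, whose minimum value is $-1/4$, so $f$ is $\tfrac{1}{4}$-weakly convex:

```python
import random

# f(x) = log(1 + x^2) has f''(x) = (2 - 2x^2)/(1 + x^2)^2 >= -1/4,
# so its gradient should satisfy the hypomonotonicity condition
#   (f'(x) - f'(y)) (x - y) >= -rho (x - y)^2   with rho = 1/4.
rho = 0.25

def fprime(x):
    return 2.0 * x / (1.0 + x * x)

random.seed(1)
hypomonotone = all(
    (fprime(x) - fprime(y)) * (x - y) >= -rho * (x - y) ** 2 - 1e-9
    for x, y in ((random.uniform(-5, 5), random.uniform(-5, 5))
                 for _ in range(10_000))
)
print(hypomonotone)
```

With any smaller $\rho$ (say $\rho = 0.1$) the sampled check fails near $x = \pm\sqrt{3}$, where the curvature bound is tight.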

2. Key Properties and Examples

Weakly convex functions preserve many of the powerful properties of convex analysis while accommodating mild nonconvexities:

  • Stability under sums and composition: The sum of a $\rho_1$- and a $\rho_2$-weakly convex function is $(\rho_1 + \rho_2)$-weakly convex. Compositions $h \circ c$, with $h$ convex and Lipschitz and $c$ a $C^1$ map with Lipschitz Jacobian, are $L_h L_{\nabla c}$-weakly convex, where $L_h$ and $L_{\nabla c}$ are the respective Lipschitz constants (Davis et al., 2018, Ma et al., 2019).
  • Closure under supremum: The supremum of uniformly $\rho$-weakly convex functions is also $\rho$-weakly convex (López-Rivera et al., 1 Feb 2025).
  • Proximal properties: For any $\lambda \in (0, 1/\rho)$, the map $y \mapsto f(y) + \tfrac{1}{2\lambda}\|y - x\|^2$ is strongly convex, so the proximal operator is single-valued and Lipschitz (Renaud et al., 17 Sep 2025).
  • Representative examples: compositions $h \circ c$ such as robust phase-retrieval losses, sparsity penalties like MCP and SCAD, and all $C^1$ functions with Lipschitz gradient.

3. Moreau Envelope and Proximal Calculus

A central tool for weakly convex analysis is the Moreau envelope $f_\lambda(x) = \min_y \{\, f(y) + \tfrac{1}{2\lambda}\|y - x\|^2 \,\}$, with associated proximal map $\operatorname{prox}_{\lambda f}(x) = \arg\min_y \{\, f(y) + \tfrac{1}{2\lambda}\|y - x\|^2 \,\}$. For $\rho$-weakly convex $f$ and $\lambda \in (0, 1/\rho)$:

  • $f_\lambda$ is everywhere finite and continuously differentiable, with $\nabla f_\lambda(x) = \tfrac{1}{\lambda}\big(x - \operatorname{prox}_{\lambda f}(x)\big)$.
  • $\nabla f_\lambda$ is Lipschitz continuous (Renaud et al., 17 Sep 2025).
  • $f_\lambda \to f$ pointwise as $\lambda \downarrow 0$.
  • The Moreau envelope preserves minimizers and critical points (Renaud et al., 17 Sep 2025).
  • The gradient norm $\|\nabla f_\lambda(x)\|$ serves as a natural stationarity measure for nonconvex, nonsmooth problems, and underpins optimality guarantees in algorithmic schemes (Davis et al., 2018, Davis et al., 2018).

For inexact proximal computations, detailed calculus using ε-subdifferentials is available, establishing rigorous inexact stationarity conditions and sum rules for composite functions (Bednarczuk et al., 2022).

4. Optimization Algorithms, Complexity, and Regularity

Optimization of weakly convex objectives leverages the structure through proximal, subgradient, and first-order splitting algorithms:

Algorithmic Frameworks

  • Proximal Point and Proximal Gradient Methods: These methods operate directly or with inexact solves on the Moreau envelope and enjoy convergence guarantees when suitable regularity (e.g., Kurdyka–Łojasiewicz (KL) property) is present (Khanh et al., 2023, Liao et al., 2 Sep 2025).
  • Stochastic Subgradient Methods: For stochastic weakly convex problems $\min_x \mathbb{E}_\xi[f(x, \xi)]$, the stochastic subgradient and proximally guided variants drive the Moreau envelope gradient to zero at rate $O(k^{-1/4})$, settling the rate for nonconvex, nonsmooth composite stochastic optimization (Davis et al., 2018, Davis et al., 2018).
  • Variable Smoothing Schemes: By decreasing the smoothing parameter, algorithms interpolate between smooth and nonsmooth rates, obtaining dimension-independent complexity $O(\varepsilon^{-3})$ for composite structured problems (Böhm et al., 2020, López-Rivera et al., 1 Feb 2025).
  • Quadratically Regularized Subgradient for Constrained Optimization: By regularizing both objective and constraints, provable complexity guarantees are established for finding nearly stationary points under uniform Slater conditions (Ma et al., 2019).
  • Primal-Dual and Forward–Backward Algorithms: When sharpness holds, linear convergence rates can be attained globally or locally for primal-dual and splitting schemes targeting weakly convex (and possibly nonconvex) objectives (Bednarczuk et al., 2023, Bednarczuk et al., 2024).
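As a toy illustration of the subgradient scheme (a hypothetical 1-D "phase retrieval" loss with untuned decaying steps, not one of the cited implementations): $f(x) = |x^2 - 1| = h(c(x))$ with $h = |\cdot|$ (1-Lipschitz) and $c(x) = x^2 - 1$, so $f$ is $2$-weakly convex, and the iterates settle near a minimizer $x^* = \pm 1$:

```python
import math

# Decaying-step subgradient method on the weakly convex, nonsmooth
# objective f(x) = |x^2 - 1|, minimized at x = ±1.

def subgrad(x):
    # one subgradient of |x^2 - 1|: sign(x^2 - 1) * 2x (taking 0 at the kink)
    s = x * x - 1.0
    return (1.0 if s > 0 else -1.0 if s < 0 else 0.0) * 2.0 * x

x = 3.0
for k in range(5000):
    x -= 0.1 / math.sqrt(k + 1) * subgrad(x)

print(x)  # close to the minimizer x* = 1
```

The iterates overshoot and oscillate around the kink at $x = 1$, but the shrinking steps force the oscillation band to contract, mirroring the Moreau-envelope stationarity guarantees above.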

Regularity Conditions, Error Bounds, and Linear Convergence

Regularity conditions for weakly convex functions mirror, but generalize, those in the convex setting. On any sublevel set, a chain of implications connects sharpness, quadratic growth, the level-set error bound, and the Kurdyka–Łojasiewicz property with exponent $1/2$ (Liao et al., 2023). Under quadratic growth, linear convergence of (inexact) proximal point and forward–backward algorithms is established.
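The linear-rate phenomenon under quadratic growth can be sketched on a toy problem (an illustrative example of mine, not from the cited works): $f(x) = x^4 - x^2$ satisfies $f''(x) = 12x^2 - 2 \geq -2$, so it is $2$-weakly convex, and its minimizers $\pm 1/\sqrt{2}$ have $f'' = 4 > 0$, giving local quadratic growth. Plain gradient descent (forward–backward with a zero nonsmooth part) started in the basin then converges linearly:

```python
# Gradient descent on the 2-weakly convex f(x) = x^4 - x^2, started
# near the minimizer x* = 1/sqrt(2).  Under local quadratic growth the
# error contracts with asymptotic factor |1 - step * f''(x*)| = 0.6.

def fprime(x):
    return 4.0 * x ** 3 - 2.0 * x

x, step = 1.0, 0.1
errors = []
for _ in range(40):
    x -= step * fprime(x)
    errors.append(abs(x - 0.5 ** 0.5))

print(errors[-1], errors[-1] / errors[-2])  # tiny error, ratio ≈ 0.6
```

Started outside the basin (e.g. at $x = 0$, a strict saddle of the landscape), no such rate is available, which is why these guarantees are conditional on initialization.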

5. Saddle Points, Sharpness, and Generic Avoidance

A notable structural property of weakly convex objectives is the landscape organization: generic weakly convex, o-minimal (definable) functions possess only local minimizers and "active strict saddles." Proximal-point, subgradient, and stochastic algorithms provably avoid strict saddles almost surely, converging instead to minimizers (Davis et al., 2019, Bianchi et al., 2021, Huang, 2021). The geometric mechanism is the instability of strict saddle fixed points under the proximal update and landscape sharpness away from active manifolds. Random perturbation methods accelerate escape from saddle traps even in nonsmooth cases (Huang, 2021).

Sharpness—a linear growth condition away from minimizers—enables local (and sometimes global) linear convergence for subgradient and forward–backward schemes on weakly convex objectives, conditional on initialization in a basin of attraction (1803.02461, Bednarczuk et al., 2023, Bednarczuk et al., 2024).

6. Second-order Calculus and Convexity Characterization

Recent work leverages generalized second-order subderivatives and coderivatives to precisely demarcate the convexity of weakly convex functions (Phat, 26 Mar 2026):

  • Graphical derivatives of the subgradient mapping: Convexity is equivalent to the positive semi-definiteness of the graphical derivative in each direction.
  • Second subderivatives: Convexity is equivalent to non-negativity of the second subderivative for all directions at each subgradient pair.
  • Second-order subdifferential: Convexity holds if $\langle z, w \rangle \geq 0$ for all $w \in \mathbb{R}^d$ and all $z \in \partial^2 f(x, v)(w)$, at each subgradient pair $(x, v)$.

These characterizations unify various fragments of second-order analysis across generalized convexity and inform Newton-type methods.
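In the twice-differentiable case these second-order objects reduce to the Hessian, so the characterizations collapse to the classical test $f'' \succeq 0$; a sketch on the weakly convex but nonconvex $f(x) = \log(1 + x^2)$ from earlier:

```python
# For C^2 functions the second subderivative at x in direction d is
# f''(x) d^2, so convexity fails exactly where f'' dips below 0.
# f(x) = log(1 + x^2) has f''(x) = (2 - 2x^2)/(1 + x^2)^2, which is
# negative for |x| > 1 yet never drops below -1/4: nonconvex, but
# (1/4)-weakly convex.

def fsecond(x):
    return (2.0 - 2.0 * x * x) / (1.0 + x * x) ** 2

values = [fsecond(-5 + 0.01 * i) for i in range(1001)]
print(min(values))  # ≈ -0.25, attained near x = ±sqrt(3)
```

The directional second-order criteria above generalize precisely this picture to nonsmooth $f$ via the subgradient graph.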

7. Applications and Practical Implications

Weakly convex models are prevalent in contemporary statistical and machine learning models, including high-dimensional robust estimation, sparse regression, dictionary learning, phase retrieval, robust PCA, and distributionally robust optimization (Davis et al., 2018, Shen et al., 2017, López-Rivera et al., 1 Feb 2025).

  • Sparsity-inducing regularization: Weakly convex penalties (like MCP, SCAD, firm-threshold) encode $\ell_0$-like properties while maintaining algorithmic tractability and provable stationarity when used with proximal-gradient descent (Shen et al., 2017).
  • Nonconvex regularization in deep learning: Nonsmooth yet weakly convex loss surfaces, as in ReLU networks and robust estimators, are amenable to first-order and splitting methods.
  • Supremum of weakly convex functions: Moreau envelope calculus extends to pointwise maxima and supremum operations, allowing envelope and proximity operator computation for classes of DRO and min-max problems (López-Rivera et al., 1 Feb 2025).
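As a concrete instance of such penalties, the firm-thresholding operator — the proximal map of the MCP penalty at unit step, assuming threshold $\lambda$ and concavity parameter $\gamma > 1$ — can be sketched as:

```python
# Firm thresholding (prox of MCP, unit step, gamma > 1): small inputs
# are zeroed exactly (l0-like behavior) and large inputs pass through
# without the constant bias that soft thresholding introduces.

def firm_threshold(x, lam, gamma):
    ax = abs(x)
    if ax <= lam:
        return 0.0          # kill small coefficients exactly
    if ax >= gamma * lam:
        return x            # leave large coefficients unbiased
    return (1.0 if x > 0 else -1.0) * gamma * (ax - lam) / (gamma - 1.0)

print([firm_threshold(v, 1.0, 2.0) for v in (0.5, 1.5, 5.0)])
# → [0.0, 1.0, 5.0]
```

The middle branch interpolates linearly between the two regimes, which is exactly the bounded nonconvexity that makes MCP $\gamma^{-1}$-weakly convex rather than convex.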

Algorithmically, the Moreau envelope enables efficient smoothing, splitting, and stochastic optimization frameworks in large-scale settings without reliance on variance reduction, mini-batching, or strong convexity (Davis et al., 2018, Renaud et al., 17 Sep 2025).


References:

Davis et al., 2018; Davis et al., 2018; Ma et al., 2019; Böhm et al., 2020; Renaud et al., 17 Sep 2025; Bednarczuk et al., 2022; Bednarczuk et al., 2023; Huang, 2021; Bednarczuk et al., 2024; Shen et al., 2017; Davis et al., 2019; Bianchi et al., 2021; López-Rivera et al., 1 Feb 2025; Liao et al., 2023; Phat, 26 Mar 2026; Liao et al., 2 Sep 2025; Khanh et al., 2023; 1803.02461.
