KL Exponent in Optimization

Updated 10 April 2026
  • The KL exponent is a quantitative measure of the local geometric regularity of extended-real-valued functions near critical points.
  • It establishes a power-law relationship between the subdifferential norm and the suboptimality gap, directly influencing the convergence rates of optimization algorithms.
  • The framework applies broadly to structured optimization problems, including sparse recovery, matrix completion, and decentralized multiagent systems, yielding precise complexity guarantees.

A Kurdyka–Łojasiewicz (KL) exponent is a quantitative descriptor of the local geometric regularity of extended-real-valued functions near critical points, and is central to the convergence-rate analysis of nonconvex and nonsmooth optimization algorithms. It provides a sharp power-law relationship between the subdifferential norm and the suboptimality gap, directly governing the local convergence behavior of first-order and proximal-type methods. The KL framework unifies broad classes of convex, weakly convex, and highly structured nonconvex problems, underpinning modern complexity guarantees in applications ranging from sparse recovery to matrix completion and decentralized multiagent systems.

1. Definition and Foundational Properties

Let f: ℝⁿ → ℝ ∪ {+∞} be proper and lower semicontinuous, and x* a critical point (0 ∈ ∂f(x*)). The function f satisfies the Kurdyka–Łojasiewicz property at x* with exponent θ ∈ [0, 1) if there exist constants η > 0, c > 0 and a continuous, concave desingularizing function φ(s) = c s^{1−θ} such that for all x near x* with 0 < f(x) − f(x*) < η the following KL inequality holds:

φ′(f(x) − f(x*)) · dist(0, ∂f(x)) ≥ 1,

which can equivalently be written as

dist(0, ∂f(x)) ≥ C (f(x) − f(x*))^θ for some constant C > 0.

The smallest such θ is termed the KL exponent of f at x*. This exponent quantifies the “flatness” or “sharpness” of f around x* (Chen et al., 2024, Qian et al., 2022, Li et al., 2023, Li et al., 2016).

Typical values and their interpretation:

  • θ = 0: finite termination, sharp minima.
  • θ ∈ (0, 1/2]: local R-linear convergence.
  • θ ∈ (1/2, 1): sublinear power-law convergence rates.

The KL property, particularly with a known exponent, enables the derivation of explicit complexity guarantees for a wide spectrum of first-order schemes, including monotone, nonmonotone, and decentralized algorithms.
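As a concrete sanity check, consider the one-dimensional family f(x) = |x|^p (an illustrative example not taken from the cited papers). Since |f′(x)| = p|x|^{p−1} = p f(x)^{1−1/p}, the KL exponent at the minimizer 0 is exactly θ = 1 − 1/p. A short NumPy sketch confirms that the subgradient-to-gap ratio stays bounded below at the true exponent and collapses for any smaller one:

```python
import numpy as np

def kl_ratio(p, theta, xs):
    """|f'(x)| / (f(x) - f*)^theta for f(x) = |x|^p, with f* = 0 at x* = 0."""
    f = np.abs(xs) ** p
    grad = p * np.abs(xs) ** (p - 1)
    return grad / f ** theta

xs = np.logspace(-6, -1, 50)    # points approaching the critical point 0
p = 4                           # f(x) = x^4: true KL exponent is 1 - 1/p = 3/4

good = kl_ratio(p, 0.75, xs)    # equals p identically: theta = 3/4 is valid
bad  = kl_ratio(p, 0.50, xs)    # collapses to 0 near x*: theta = 1/2 fails

assert good.min() > 3.99
assert bad.min() < 1e-2
```

The same numerical probe can be used to guess the exponent of more complicated functions before attempting a proof.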

2. Characterizations, Computation, and Sharpness

There are multiple characterizations of the KL exponent based on variational analysis and subdifferential geometry. At a stationary point x*, the exponent θ is sharp if no smaller value works. The modulus, defined as the supremal constant C for which

dist(0, ∂f(x)) ≥ C (f(x) − f(x*))^θ

holds locally, can be computed via outer limiting subdifferentials of f (Li et al., 2023).

A powerful insight is that for broad classes of functions (prox-regular, semi-algebraic, piecewise-smooth, and their inf-projections) the KL property with exponent 1/2 may be characterized in terms of graphical derivatives or quadratic growth conditions. The quadratic growth condition

f(x) ≥ f(x*) + κ dist(x, X*)² for all x near x*, with κ > 0,

where X* denotes the local solution set, is equivalent (under suitable regularity) to the KL property with θ = 1/2 (Pan et al., 2018, Li et al., 2023).

For nonsmooth or composite functions, subdifferential subregularity with respect to the critical set or the Moreau envelope approach provides systematic routes to verifying the KL-½ property (Pan et al., 2018, Li et al., 2023).
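Both conditions can be checked numerically on an assumed toy problem: an underdetermined least-squares objective, where the solution set X* is an affine subspace and the quadratic growth and KL-½ constants are controlled by the extreme singular values of the data matrix (this is an illustrative sketch, not a construction from the cited papers):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 8))     # wide system: the solution set X* is an affine subspace
b = rng.standard_normal(5)
xhat = np.linalg.pinv(A) @ b        # a least-squares solution
s = np.linalg.svd(A, compute_uv=False)
smin, smax = s[-1], s[0]            # extreme positive singular values

f    = lambda x: 0.5 * np.sum((A @ x - b) ** 2)
grad = lambda x: A.T @ (A @ x - b)
# X* = xhat + null(A); dist(x, X*) is the norm of the row-space component of x - xhat
dist = lambda x: np.linalg.norm(np.linalg.pinv(A) @ (A @ (x - xhat)))

fstar = f(xhat)
for _ in range(100):
    x = xhat + 1e-3 * rng.standard_normal(8)
    gap = f(x) - fstar
    # quadratic growth with kappa = smin^2 / 2
    assert gap >= 0.5 * smin**2 * dist(x) ** 2 - 1e-12
    # KL-1/2 with C = sqrt(2) * smin
    assert np.linalg.norm(grad(x)) >= np.sqrt(2) * smin * np.sqrt(gap) - 1e-12
```

Here both certificates hold with explicit constants because the gap equals half the squared norm of A applied to the row-space component of x − xhat.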

3. Calculus Rules and KL Exponent Preservation

KL exponents are preserved or tightly controlled under several function operations, enabling their propagation from elementary to highly structured composite objectives. Formally, for proper lsc functions f_i with KL exponents θ_i ∈ [0, 1), and continuously differentiable maps F with surjective Jacobian, the following hold (Li et al., 2016, Wang et al., 2021, Yu et al., 2019):

Construction | KL exponent rule | Reference
Minimum f = min_i f_i | θ = max_i θ_i | (Li et al., 2016)
Separable sum f(x) = Σ_i f_i(x_i) | θ = max_i θ_i | (Li et al., 2016)
Smooth composition f = g ∘ F | θ = θ_g | (Li et al., 2016)
Moreau envelope e_λf | θ preserved for convex f with θ ∈ [1/2, 1) | (Li et al., 2016)
Inf-projection | preserved under conditions | (Yu et al., 2019)
Square/Hadamard param. | θ = 1/2 preserved under strict complementarity | (Ouyang, 11 Jun 2025, Ouyang et al., 2024)

Generalized calculus rules that do not assume differentiable or power-law desingularizing functions further extend these results, admitting nondifferentiable forms and yielding the exact modulus as the smallest possible desingularizer (Wang et al., 2021). For instance, the Hadamard difference parametrization of ℓ1-regularized losses propagates the KL exponent from the base model, with explicit rules (under strict complementarity) that guarantee θ = 1/2 at second-order stationary points (Ouyang et al., 2024).
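The separable-sum rule can be illustrated on the assumed toy objective f(x, y) = x² + y⁴, whose components have exponents 1/2 and 3/4 at the origin. The sum inherits the larger exponent 3/4, because the exponent must be valid along every direction of approach:

```python
import numpy as np

# Separable sum f(x, y) = x^2 + y^4: the components have KL exponents 1/2 and
# 3/4 at the origin, so the sum rule predicts max(1/2, 3/4) = 3/4.
def ratio(theta, x, y):
    f = x**2 + y**4
    g = np.hypot(2 * x, 4 * y**3)   # ||grad f(x, y)||
    return g / f**theta

t = np.logspace(-6, -1, 50)
zero = 0 * t

# Along the y-axis the slow y^4 term dominates: theta = 3/4 holds, 1/2 fails.
assert ratio(0.75, zero, t).min() > 3.99    # ratio is identically 4 there
assert ratio(0.50, zero, t).min() < 1e-2    # tends to 0 near the origin

# Along the x-axis, x^2 alone would even allow theta = 1/2 ...
assert ratio(0.50, t, zero).min() > 1.99
# ... but the exponent must hold along every direction, hence 3/4 overall.
```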

4. Algorithmic and Complexity Implications

The KL exponent is the determining constant for local complexity in a wide array of first-order optimization methods. Under two general algorithmic axioms—nonmonotone descent and relative error—the convergence behavior of iterates generated by the optimization algorithm is dictated entirely by the KL exponent θ (Qian et al., 2022, Chen et al., 2024):

  • θ = 0: finite-step convergence.
  • θ ∈ (0, 1/2]: global (R-)linear convergence, ‖x_k − x*‖ = O(ρ^k) for some ρ ∈ (0, 1).
  • θ ∈ (1/2, 1): sublinear, polynomial rate ‖x_k − x*‖ = O(k^{−(1−θ)/(2θ−1)}).

The same exponent governs the decay of the objective gap f(x_k) − f(x*). These complexity results apply across monotone descent, nonmonotone search, block-coordinate, and decentralized gradient-tracking methods, including but not limited to proximal gradient, inertial, and alternating minimization algorithms (Li et al., 2016, Qian et al., 2022, Chen et al., 2024).

In decentralized multiagent settings, e.g., for SONATA gradient tracking over networks, the global convergence rate precisely mirrors the KL exponent regime of the centralized problem, up to network spectral gap effects. For models like LASSO or nonconvex PCA with θ = 1/2, this yields R-linear convergence for both settings (Chen et al., 2024).
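The dichotomy between the linear and sublinear regimes is visible even for plain gradient descent on the one-dimensional toy problems f(x) = x² (θ = 1/2) and f(x) = x⁴ (θ = 3/4); this is an illustrative sketch, not a reproduction of any cited algorithm:

```python
import numpy as np

def gd(grad, x0, lr, iters):
    """Plain gradient descent, returning the final iterate."""
    x = x0
    for _ in range(iters):
        x = x - lr * grad(x)
    return x

# f(x) = x^2: KL exponent 1/2 at x* = 0  ->  R-linear convergence
x_lin = gd(lambda x: 2 * x, x0=1.0, lr=0.1, iters=200)

# f(x) = x^4: KL exponent 3/4 at x* = 0  ->  sublinear, x_k = O(k^{-1/2})
x_sub = gd(lambda x: 4 * x**3, x0=1.0, lr=0.1, iters=200)

assert abs(x_lin) < 1e-15       # geometric decay: |x_k| = 0.8^k
assert 1e-3 < abs(x_sub) < 1.0  # same budget, still far from the minimizer
```

With an identical iteration budget, the θ = 1/2 problem is solved to machine precision while the θ = 3/4 problem is still an order of 10⁻¹ away from its minimizer.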

5. Canonical Models and Explicit Exponents

For a wide class of structured optimization models frequently encountered in signal processing, machine learning, and statistical estimation, the KL exponent can be computed or tightly estimated. Canonical examples (Li et al., 2016, Pan et al., 2018, Bi et al., 2019, Tao et al., 2019, Chen et al., 2024) include:

  • Quadratic + ℓ1 models (LASSO), smoothly clipped absolute deviation (SCAD), minimax concave penalty (MCP): θ = 1/2.
  • Logistic regression with ℓ1 penalty: θ = 1/2.
  • Factorized low-rank matrix recovery (with squared Frobenius-norm or related norm regularization): θ = 1/2 on (neighborhoods of) global minimizers under restricted isometry or condition number assumptions (Tao et al., 2019, Bi et al., 2019).
  • Rank-constrained and rank-regularized models: exponent 1/2 holds on structured sets under explicit geometric assumptions (Bi et al., 2019, Tao et al., 2019).
  • Decentralized structured nonconvex optimization (e.g., decentralized PCA, LASSO via SONATA): θ = 1/2 yields R-linear complexity (Chen et al., 2024).

For higher-degree polynomials or deep neural networks with nonsmooth activations, one typically obtains θ ∈ (1/2, 1), with bounds such as θ ≤ 1 − 1/(d(3d − 3)^{n−1}) for a real-analytic polynomial of degree d in n variables (Chen et al., 2024).
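For LASSO, the θ = 1/2 regime predicts R-linear convergence of the proximal gradient method (ISTA). A small self-contained sketch on synthetic data (problem sizes and parameters chosen arbitrarily for illustration) exhibits the geometric decay of the objective gap:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((50, 10))       # tall, well-conditioned toy design
b = rng.standard_normal(50)
lam = 0.5                               # l1 weight (arbitrary choice)
L = np.linalg.norm(A, 2) ** 2           # Lipschitz constant of the smooth part

def soft(z, t):
    """Soft-thresholding: the proximal map of t * ||.||_1."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def F(x):
    """LASSO objective 0.5 ||Ax - b||^2 + lam ||x||_1."""
    return 0.5 * np.sum((A @ x - b) ** 2) + lam * np.sum(np.abs(x))

x = np.zeros(10)
vals = []
for _ in range(300):
    x = soft(x - A.T @ (A @ x - b) / L, lam / L)    # one ISTA step
    vals.append(F(x))

gaps = np.array(vals) - vals[-1]        # objective gap vs. a long-run reference
# theta = 1/2 predicts geometric decay: 90 extra iterations shrink the gap by
# far more than any polynomial-rate decay would
assert gaps[10] > 0 and gaps[100] / gaps[10] < 0.05
```

The monotone, geometric shrinkage of the gap over successive iteration windows is the numerical signature of the KL-½ regime described above.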

6. Error Bounds, Metric Subregularity, and Quadratic Growth

There is a fundamental equivalence between the KL-½ property, metric subregularity of the subdifferential, and local quadratic growth under convexity, prox-regularity, or tame geometry (Pan et al., 2018, Li et al., 2023). In particular:

  • For convex, lower-semicontinuous f, the following are equivalent at stationary points:

    1. The subdifferential ∂f is metrically subregular.
    2. f satisfies a local quadratic growth bound.
    3. The KL property holds with θ = 1/2.
  • Similar equivalences hold for locally uniform prox-regular or semi-algebraic functions, with value separation on the critical set ensuring subregularity implies KL-½.

These mechanisms enable error-bound based verification of KL exponents (notably via the Luo–Tseng error bound), allowing for “machine-verifiable” certification of linear rates in complex nonsmooth problems, such as sparse quadratic minimization under cardinality constraints or composite factorized settings (Pan et al., 2018, Li et al., 2016, Li et al., 2023).
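The equivalence can be observed numerically on two assumed toy functions: for f(x) = x² all three certificates are bounded away from zero near the minimizer, while for f(x) = x⁴ all three degenerate together:

```python
import numpy as np

# Three equivalent conditions at the minimizer x* = 0 of a convex function f:
#   (i)   metric subregularity:  dist(x, X*) <= kappa * dist(0, df(x))
#   (ii)  quadratic growth:      f(x) - f*   >= mu * dist(x, X*)^2
#   (iii) KL with theta = 1/2:   dist(0, df(x)) >= c * (f(x) - f*)^(1/2)
# They hold together for f(x) = x^2 and fail together for f(x) = x^4.

xs = np.logspace(-6, -1, 50)            # points approaching x* = 0

def certificates(f, g, x):
    """(subregularity, growth, KL-1/2) ratios; a condition holds iff its
    ratio stays bounded away from zero near x*."""
    return g(x) / x, f(x) / x**2, g(x) / np.sqrt(f(x))

ok   = certificates(lambda x: x**2, lambda x: 2 * x,    xs)
fail = certificates(lambda x: x**4, lambda x: 4 * x**3, xs)

assert all(r.min() > 0.9 for r in ok)      # all three certificates hold
assert all(r.min() < 1e-3 for r in fail)   # all three degenerate together
```

That the three ratios succeed or fail in lockstep is exactly the content of the equivalence theorem quoted above.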

7. Extensions, Exact Moduli, and Generalizations

Recent work generalizes the KL exponent formalism beyond the canonical power-function desingularizers to broader classes of concave, possibly nondifferentiable functions, defining an exact modulus for the KL property (Wang et al., 2021). This allows for sharp calculus results and convergence rate estimates in cases where the power law form is suboptimal or fails, such as for “super-flat,” piecewise, or composite functions with intricate geometric structures.

Advances also refine the behavior of KL exponents under reparameterizations such as the square transformation or Hadamard parameterizations, connecting the exponent to that of the original problem or showing sharp lower bounds (e.g., under strict complementarity, the KL exponent of a square-reparameterized problem remains 1/2 if the original exponent is 1/2) (Ouyang, 11 Jun 2025, Ouyang et al., 2024).
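A one-dimensional sketch of this effect (an assumed analogue of the strict complementarity condition, for illustration only): squaring the variable in f(x) = (x − a)² preserves the exponent 1/2 when the minimizer sits at a = 1 ≠ 0, but degrades it to 3/4 when a = 0:

```python
import numpy as np

# Square reparameterization x = u^2 of f(x) = (x - a)^2 gives g(u) = (u^2 - a)^2.
# a = 1: the minimizer x* = 1 is nonzero (strict-complementarity analogue);
#        near u = 1, g(u) ~ 4 (u - 1)^2, so the exponent 1/2 is preserved.
# a = 0: the minimizer sits at x* = 0; g(u) = u^4 and the exponent becomes 3/4.

us = 1e-4 * np.logspace(-3, 0, 40)      # small offsets from the minimizer

# a = 1: check KL-1/2 for g at u* = 1
g1  = lambda u: (u**2 - 1.0) ** 2
dg1 = lambda u: 4 * u * (u**2 - 1.0)
r_half = np.abs(dg1(1.0 + us)) / np.sqrt(g1(1.0 + us))
assert r_half.min() > 3.9               # ratio = 4 (1 + u): exponent 1/2 preserved

# a = 0: g(u) = u^4; KL-1/2 fails while KL-3/4 holds
g0  = lambda u: u**4
dg0 = lambda u: 4 * u**3
assert (np.abs(dg0(us)) / np.sqrt(g0(us))).min() < 1e-2     # 1/2 fails
assert (np.abs(dg0(us)) / g0(us) ** 0.75).min() > 3.9       # 3/4 holds
```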


References: (Li et al., 2016, Pan et al., 2018, Yu et al., 2019, Bi et al., 2019, Tao et al., 2019, Wang et al., 2021, Qian et al., 2022, Li et al., 2023, Ouyang et al., 2024, Chen et al., 2024, Ouyang, 11 Jun 2025).
