
Proximal Alternating Linearized Minimization Algorithm

Updated 6 January 2026
  • Proximal Alternating Linearized Minimization (PALM) is a structured block-coordinate algorithm for nonconvex, nonsmooth optimization using proximal steps and linearization.
  • It alternates updates via block-specific surrogate functions, ensuring global convergence under the Kurdyka–Łojasiewicz property.
  • TiBPALM extends PALM by incorporating two-step inertia and Bregman distance regularization to accelerate convergence in applications like sparse recovery and imaging.

The Proximal Alternating Linearized Minimization (PALM) algorithm and its advanced variants, such as the Two-step Inertial Bregman PALM (TiBPALM), constitute a robust class of block-coordinate methods for structured optimization. These methods address general nonconvex, nonsmooth, and nonseparable two-block problems of the form

$$\min_{x \in \mathbb{R}^l, \, y \in \mathbb{R}^m} \; L(x, y) = f(x) + Q(x, y) + g(y)$$

where $f$ and $g$ are proper, lower-semicontinuous functions (possibly nonconvex and nonsmooth), and $Q$ is $C^1$ with Lipschitz continuous gradient. The PALM framework, particularly in its inertial and Bregman generalized forms, delivers global convergence guarantees under the Kurdyka–Łojasiewicz (KL) property, and allows block-specific generalizations critical for modern large-scale applications (Guo et al., 2023).

1. Class of Problems and Block Structure

The core setting involves partitioning the optimization variables into two (or, in some extensions, more) blocks, $x$ and $y$, over which the objective $L(x, y) = f(x) + Q(x, y) + g(y)$ is defined. Here,

  • $f: \mathbb{R}^l \to (-\infty, +\infty]$ and $g: \mathbb{R}^m \to (-\infty, +\infty]$ are (potentially nonconvex, nonsmooth) proper, lsc functions.
  • $Q: \mathbb{R}^l \times \mathbb{R}^m \to \mathbb{R}$ is $C^1$ in $(x, y)$ with $\nabla Q$ Lipschitz continuous on bounded sets.
  • No separability of $f + Q + g$ is assumed.

This abstraction is broad enough to encompass constrained and penalized data fitting in signal recovery, machine learning, imaging, and quadratic fractional programming.
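To make the abstraction concrete, the following sketch instantiates $L(x, y) = f(x) + Q(x, y) + g(y)$ for a sparse nonnegative matrix factorization model of the kind discussed later in this article; the data, dimensions, and sparsity weight are illustrative choices, not values from any cited paper.

```python
import numpy as np

# Illustrative instantiation of L(x, y) = f(x) + Q(x, y) + g(y):
# nonnegative matrix factorization M ~ X Y with an l0-style sparsity
# penalty on X. The lambda weight and sizes are assumptions for the sketch.
rng = np.random.default_rng(0)
M = rng.random((8, 6))   # data matrix
lam = 0.1                # sparsity weight (illustrative)

def f(X):
    # nonconvex, nonsmooth block term: l0 penalty + nonnegativity indicator
    if np.any(X < 0):
        return np.inf
    return lam * np.count_nonzero(X)

def Q(X, Y):
    # smooth coupling term with locally Lipschitz gradient
    return 0.5 * np.linalg.norm(M - X @ Y, "fro") ** 2

def g(Y):
    # nonnegativity indicator on the second block
    return 0.0 if np.all(Y >= 0) else np.inf

X0 = np.abs(rng.random((8, 3)))
Y0 = np.abs(rng.random((3, 6)))
L_val = f(X0) + Q(X0, Y0) + g(Y0)
```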

2. Standard PALM Methodology

The fundamental PALM scheme operates by alternating linearized majorizations of the block-coupled term $Q(x, y)$ and applying blockwise proximal steps:

  • At iterate $(x^k, y^k)$, pick step-sizes $\lambda_k > 0$, $\mu_k > 0$. Perform

$$x^{k+1} \in \arg \min_x \left\{ f(x) + \langle \nabla_x Q(x^k, y^k), x - x^k \rangle + \frac{1}{2\lambda_k} \| x - x^k \|^2 \right\}$$

$$y^{k+1} \in \arg \min_y \left\{ g(y) + \langle \nabla_y Q(x^{k+1}, y^k), y - y^k \rangle + \frac{1}{2\mu_k} \| y - y^k \|^2 \right\}$$

This amounts to block-coordinate descent on local quadratic majorants. The PALM iteration requires only mild regularity: no convexity of $f$, $g$, or $Q$ is needed beyond the outlined smoothness/lsc assumptions.
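The two alternating proximal steps above can be sketched in a few lines. This is a minimal illustration, assuming $f = \|\cdot\|_1$, $g$ the nonnegativity indicator, and a quadratic coupling $Q(x,y) = \tfrac{1}{2}\|Ax + By - b\|^2$, so that both proximal operators have closed forms; the problem data are invented for the example.

```python
import numpy as np

# Minimal PALM sketch for f(x)=||x||_1, Q(x,y)=0.5||Ax+By-b||^2,
# g(y)=indicator{y>=0}. Data and step sizes are illustrative.
rng = np.random.default_rng(1)
A = rng.standard_normal((20, 5))
B = rng.standard_normal((20, 4))
b = rng.standard_normal(20)

def soft(v, t):
    # proximal operator of t*||.||_1 (soft thresholding)
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

lam = 1.0 / np.linalg.norm(A, 2) ** 2   # 1/L_x, Lipschitz const of grad_x Q
mu  = 1.0 / np.linalg.norm(B, 2) ** 2   # 1/L_y
x, y = np.zeros(5), np.zeros(4)
obj = lambda x, y: np.abs(x).sum() + 0.5 * np.linalg.norm(A @ x + B @ y - b) ** 2

vals = [obj(x, y)]
for _ in range(200):
    r = A @ x + B @ y - b
    x = soft(x - lam * (A.T @ r), lam)        # x-block proximal-gradient step
    r = A @ x + B @ y - b
    y = np.maximum(y - mu * (B.T @ r), 0.0)   # y-block proximal-gradient step
    vals.append(obj(x, y))
```

With the $1/L$ step sizes, each block update is a majorize-minimize step, so the objective is nonincreasing along the iterates.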

3. Inertial and Bregman Extensions: TiBPALM

To accelerate convergence and handle generic geometry, the TiBPALM algorithm incorporates:

  • Two-step inertia: Extrapolations of the form $\alpha_{1k}(x_k - x_{k-1}) + \alpha_{2k}(x_{k-1} - x_{k-2})$ leverage momentum from the two preceding steps, generalizing standard heavy-ball approaches.
  • Bregman distance regularization: The squared norm is replaced with a Bregman kernel $D_\phi(u, v) := \phi(u) - \phi(v) - \langle \nabla \phi(v), u - v \rangle$, typically with a strongly convex, smooth $\phi$. This is critical when the geometry of the proximal steps is non-Euclidean or a Euclidean prox is not easily solvable.

The TiBPALM update for block $x$ (similarly for $y$) is:

$$x_{k+1} \in \arg\min_x \Big\{ f(x) + \langle \nabla_x Q(x_k, y_k), x \rangle + D_{\phi_1}(x, x_k) + \alpha_{1k}\langle x, x_{k-1} - x_k \rangle + \alpha_{2k} \langle x, x_{k-2} - x_{k-1} \rangle \Big\}$$

Momentum and geometry are thus jointly exploited.
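With the Euclidean kernel $\phi_1 = \frac{1}{2\lambda}\|\cdot\|^2$, the $x$-update reduces to a proximal-gradient step in which the two inertial terms fold into the linear part of the subproblem. The following sketch shows that reduction for $f = \|\cdot\|_1$; the gradient vector and parameter values are illustrative.

```python
import numpy as np

# TiBPALM x-update with the Euclidean kernel phi_1 = ||.||^2 / (2*lam),
# for which D_phi(x, x_k) = ||x - x_k||^2 / (2*lam) and the argmin is a
# soft-thresholding step with two-step momentum in the linear term.
def soft(v, t):
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def tibpalm_x_step(xk, xkm1, xkm2, grad_Q, lam=0.1, a1=0.3, a2=0.1):
    # linear coefficient of the subproblem:
    # grad_Q + a1*(x_{k-1} - x_k) + a2*(x_{k-2} - x_{k-1})
    c = grad_Q + a1 * (xkm1 - xk) + a2 * (xkm2 - xkm1)
    # closed-form argmin for f = ||.||_1 under this kernel
    return soft(xk - lam * c, lam)

xk   = np.array([0.5, -0.2, 1.0])
xkm1 = np.array([0.4, -0.1, 0.9])
xkm2 = np.array([0.3,  0.0, 0.8])
grad = np.array([1.0, -1.0, 0.0])
x_next = tibpalm_x_step(xk, xkm1, xkm2, grad)
```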

4. Benefit Function and Convergence Analysis

A distinctive feature of TiBPALM is the introduction of a benefit function $H(u, v, w)$ across three consecutive iterates:

$$H(u, v, w) := L(u) + \frac{\alpha_1 + \alpha_2}{2}\| u - v \|^2 + \frac{\alpha_2}{2}\| v - w \|^2$$

This Lyapunov-type construction enables sufficient-decrease arguments that account for the inertial terms.

Under strong convexity of the Bregman kernels $\phi_i$ (with moduli $\theta_i$) and appropriate upper bounds on the inertial parameters $\alpha_i$, one establishes the key decrease inequality:

$$H(z_{k+1}, z_k, z_{k-1}) + a \| z_{k+1} - z_k \|^2 \leq H(z_k, z_{k-1}, z_{k-2})$$

where $a = (\rho - 2(\alpha_1 + \alpha_2))/2 > 0$ and $\rho = \min\{\theta_1 - L_1^+,\, \theta_2 - L_2^+\}$.
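Summing this inequality over $k$ telescopes the benefit function, which is the standard route from sufficient decrease to asymptotic regularity (a step the KL argument then sharpens into full sequence convergence):

```latex
a \sum_{k=1}^{K} \| z_{k+1} - z_k \|^2
  \le \sum_{k=1}^{K} \Big( H(z_k, z_{k-1}, z_{k-2}) - H(z_{k+1}, z_k, z_{k-1}) \Big)
  = H(z_1, z_0, z_{-1}) - H(z_{K+1}, z_K, z_{K-1})
```

Provided $H$ is bounded below (as when $L$ is), the right-hand side stays bounded as $K \to \infty$, so $\| z_{k+1} - z_k \| \to 0$.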

The KL property is invoked to conclude global convergence: the whole sequence $\{z_k\}$ has finite length and converges to a critical point of $L$. This leverages the abstract KL-descent framework for general nonconvex, nonsmooth scenarios (Guo et al., 2023).

5. Relation to Prior and Stochastic Variants

The inertial Bregman PALM framework synthesizes and generalizes several prior advancements:

  • iPALM (Inertial PALM) incorporates single-step momentum but uses standard prox-terms rather than Bregman distances (Pock et al., 2017).
  • Stochastic and variance-reduced schemes adapt PALM to finite-sum problems using variance-reduced estimators (SAGA, SARAH), with global convergence and accelerated rates established under similar KL-type assumptions (Guo et al., 2023, Driggs et al., 2020). The stochastic two-step inertial Bregman PALM (STiBPALM) extends TiBPALM to the stochastic regime.
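As an illustration of the variance-reduced gradient estimators these stochastic variants rely on, here is a minimal SAGA-style sketch; the quadratic component functions $Q_i$ and the dimensions are invented for the example.

```python
import numpy as np

# SAGA-style variance-reduced estimator of grad Q(x) = (1/n) sum_i grad Q_i(x),
# the kind of estimator plugged into stochastic PALM variants (e.g., STiBPALM).
rng = np.random.default_rng(2)
n, d = 10, 4
A = rng.standard_normal((n, d))
b = rng.standard_normal(n)

def grad_i(i, x):
    # gradient of the illustrative component Q_i(x) = 0.5*(a_i^T x - b_i)^2
    return A[i] * (A[i] @ x - b[i])

x = rng.standard_normal(d)
table = np.stack([grad_i(i, x) for i in range(n)])   # stored past gradients
table_mean = table.mean(axis=0)

def saga_estimator(x):
    global table_mean
    i = rng.integers(n)
    g_new = grad_i(i, x)
    est = g_new - table[i] + table_mean      # unbiased estimate of the full gradient
    table_mean += (g_new - table[i]) / n     # maintain the running mean
    table[i] = g_new                         # refresh the stored gradient
    return est
```

Because the stored gradients anchor the estimator, its variance vanishes as the iterates converge, which is what allows the KL-type convergence analysis to carry over to the stochastic regime.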

A comparison is summarized below:

| Method | Inertia | Proximal Geometry | Stochasticity | Reference |
| --- | --- | --- | --- | --- |
| PALM | None | Euclidean | No | (Guo et al., 2023) |
| iPALM | 1-step (heavy-ball) | Euclidean | No | (Pock et al., 2017) |
| TiBPALM | 2-step | Bregman | No | (Guo et al., 2023) |
| STiBPALM | 2-step | Bregman | Yes | (Guo et al., 2023) |

The advantage of Bregman distances becomes pronounced when prox-subproblems associated with convex indicator or regularizer functions become tractable in non-Euclidean geometry (e.g., the Kullback–Leibler distance for nonnegativity, or half-shrinkage for $\ell_{1/2}$).
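For instance, with the entropy kernel $\phi(x) = \sum_j x_j \log x_j$, whose Bregman distance is the (generalized) Kullback–Leibler divergence, the Bregman proximal step for a linear term has a closed multiplicative form that keeps iterates strictly positive. A minimal sketch, with illustrative inputs:

```python
import numpy as np

# Bregman proximal step with the entropy kernel phi(x) = sum_j x_j log x_j.
# For a linear term <c, x>, the step argmin_x { <c, x> + D_phi(x, x_k)/lam }
# has the closed form below: stationarity gives
#   c + (log x - log x_k)/lam = 0  =>  x = x_k * exp(-lam * c),
# so nonnegativity is enforced automatically, with no projection needed.
def entropy_bregman_step(xk, c, lam):
    return xk * np.exp(-lam * c)

xk = np.array([1.0, 2.0, 0.5])     # current (positive) iterate
c  = np.array([0.0, 1.0, -1.0])    # linearized gradient term
x_next = entropy_bregman_step(xk, c, lam=1.0)
```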

6. Empirical Performance and Applications

Numerical studies confirm TiBPALM’s superiority over both noninertial PALM and one-step inertial variants (iPALM, GiPALM) on diverse nonconvex benchmarks:

  • Sparse nonnegative matrix factorization with $\ell_0$-sparsity and nonnegativity constraints.
  • Sparse signal recovery under $\ell_{1/2}$ regularization.
  • Quadratic fractional programming with box constraints.

TiBPALM consistently achieved faster decrease in the objective (fewer iterations and reduced wall-clock time for the same accuracy), attributable to the combination of two-step inertial extrapolation and Bregman geometry, and enabled efficient closed-form inner updates in practical instances (Guo et al., 2023).

7. Extensions and Theoretical Significance

The PALM paradigm, especially in its inertial and Bregman-extended forms, underpins a rich landscape of modern nonconvex optimization algorithms. It readily admits:

  • Block-separable multimodal generalizations (BPALM/A-BPALM) (Ahookhosh et al., 2019).
  • Variable metric and composite proximal variants for composite nonsmooth terms (Yashtini, 2022).
  • Inexact and infeasible subsolver frameworks (PALM-I) with surrogate sequences restoring descent (Hu et al., 2022).
  • Unrolled, learned, and interpretable deep optimization networks, where the entire PALM structure is encoded in a parameter-efficient, convergence-guaranteed architecture (Chen et al., 2024).

These developments establish PALM and its advanced variants as foundational tools for scalable, structured, and theoretically grounded optimization in modern computational mathematics and machine learning.
