Bregman Regularized Proximal Point Algorithm
- Bregman Regularized Proximal Point Algorithm is a generalization of the classical proximal point method that uses Bregman divergences to capture problem geometry and constraints.
- The method supports inexact updates and accelerates convergence from O(1/N) to O(1/N²) through controlled error tolerances and Nesterov-type mixing.
- It is widely applied in convex optimization, equilibrium problems, unbalanced optimal transport, and stochastic settings, offering both theoretical guarantees and practical computational benefits.
The Bregman Regularized Proximal Point Algorithm is a generalization of the classical proximal point approach for finding zeros of monotone operators or minimizers of convex (and more generally, nonconvex or composite) functions. It leverages Bregman divergences—parameterized by strictly convex, smooth “distance-generating” functions—to regularize the update steps, enabling iterations to better reflect problem geometry and constraints. The Bregman framework underpins methodological advances across convex optimization, equilibrium problems, optimal transport, and large-scale machine learning, offering both theoretical guarantees and practical computational benefits.
1. Foundations: Bregman Divergence and Proximal Updates
Let $h$ be a Legendre function on a convex domain $C$ (i.e., strictly convex, differentiable on its interior, with $\|\nabla h(x)\| \to \infty$ as $x$ approaches the boundary). The associated Bregman divergence is
$$D_h(x, y) = h(x) - h(y) - \langle \nabla h(y),\, x - y \rangle.$$
For $h(x) = \tfrac{1}{2}\|x\|_2^2$, $D_h$ reduces to the (halved) squared Euclidean distance; for the negative entropy $h(x) = \sum_i x_i \log x_i$, it delivers the Kullback–Leibler (KL) divergence.
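These two kernels can be checked numerically; the following is a minimal sketch (the function names are illustrative, not from any cited paper):

```python
import numpy as np

def bregman(h, grad_h, x, y):
    """Bregman divergence D_h(x, y) = h(x) - h(y) - <grad_h(y), x - y>."""
    return h(x) - h(y) - np.dot(grad_h(y), x - y)

# Euclidean kernel: h(x) = 0.5 ||x||^2  ->  D_h(x, y) = 0.5 ||x - y||^2
h_euc = lambda x: 0.5 * np.dot(x, x)
g_euc = lambda x: x

# Negative-entropy kernel: h(x) = sum_i x_i log x_i
# -> D_h(x, y) = KL(x || y) for x, y on the probability simplex
h_ent = lambda x: np.sum(x * np.log(x))
g_ent = lambda x: np.log(x) + 1.0

x = np.array([0.2, 0.3, 0.5])
y = np.array([0.4, 0.4, 0.2])
d_euc = bregman(h_euc, g_euc, x, y)   # = 0.5 ||x - y||^2
d_kl = bregman(h_ent, g_ent, x, y)    # = sum_i x_i log(x_i / y_i)
```

Note that $D_h$ is in general asymmetric in its arguments, unlike the Euclidean case.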
The (exact) Bregman proximal point update for minimizing a convex $f$ is given by:
$$x_{k+1} = \operatorname*{arg\,min}_{x \in C} \Big\{ f(x) + \frac{1}{\gamma_k} D_h(x, x_k) \Big\},$$
where $\gamma_k > 0$ is the stepsize parameter. The first-order optimality condition reads:
$$0 \in \gamma_k\, \partial f(x_{k+1}) + \nabla h(x_{k+1}) - \nabla h(x_k).$$
This construction encompasses Euclidean PPA, mirror descent, and entropy-regularized iterations as special cases (Jiang et al., 2022, Zhou et al., 2015).
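As a concrete special case, minimizing a linear function $f(x) = \langle c, x \rangle$ over the probability simplex with the entropy kernel gives a subproblem with a closed-form multiplicative solution. A minimal sketch on an assumed toy instance:

```python
import numpy as np

def bppa_entropy_linear(c, x0, gamma, iters):
    """Exact Bregman proximal point for f(x) = <c, x> over the simplex.
    With the entropy kernel, the subproblem
        argmin_x  <c, x> + (1/gamma) KL(x || x_k)
    has the closed-form multiplicative solution x_{k+1} ∝ x_k exp(-gamma c)."""
    x = x0.copy()
    for _ in range(iters):
        x = x * np.exp(-gamma * c)
        x = x / x.sum()               # normalize back onto the simplex
    return x

c = np.array([0.9, 0.1, 0.5])         # costs; the minimizer is the vertex e_2
x = bppa_entropy_linear(c, np.ones(3) / 3, gamma=1.0, iters=200)
```

The iterates stay strictly inside the simplex and concentrate on the cheapest vertex, illustrating how the entropic kernel enforces the constraint without any explicit projection.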
2. Inexact and Accelerated Bregman Proximal Point Methods
Solving each subproblem exactly is often prohibitively costly or impractical. Inexact variants relax the requirement by allowing a controlled error $\varepsilon_k$, typically subject to summability:
$$0 \in \gamma_k\, \partial_{\varepsilon_k} f(x_{k+1}) + \nabla h(x_{k+1}) - \nabla h(x_k), \qquad \sum_{k} \varepsilon_k < \infty,$$
where $\partial_{\varepsilon}$ denotes the $\varepsilon$-subdifferential (Chen et al., 2024, Yang et al., 2021).
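A minimal sketch of the inexact scheme in the Euclidean case ($h = \tfrac12\|\cdot\|^2$), with an inner gradient-descent solver stopped at a summable tolerance; the least-squares instance and all names are illustrative assumptions:

```python
import numpy as np

def inexact_bppa(grad_f, x0, gamma, inner_step, outer_iters, eps0=1.0):
    """Inexact proximal point with the Euclidean kernel: each subproblem
    min_z f(z) + (1/(2*gamma)) ||z - x_k||^2 is solved by inner gradient
    descent only up to a summable tolerance eps_k = eps0 / k^2."""
    x = x0.copy()
    for k in range(1, outer_iters + 1):
        eps_k = eps0 / k ** 2                     # summable error schedule
        z = x.copy()
        for _ in range(2000):                     # approximate inner solver
            g = grad_f(z) + (z - x) / gamma       # subproblem gradient
            if np.linalg.norm(g) <= eps_k:
                break
            z = z - inner_step * g
        x = z
    return x

# toy least-squares instance (assumed example, not from the cited papers)
rng = np.random.default_rng(0)
A = rng.standard_normal((8, 4))
b = rng.standard_normal(8)
grad_f = lambda x: A.T @ (A @ x - b)
step = 1.0 / (np.linalg.norm(A, 2) ** 2 + 1.0)    # 1 / (L_f + 1/gamma)
x_hat = inexact_bppa(grad_f, np.zeros(4), gamma=1.0, inner_step=step,
                     outer_iters=50)
x_star, *_ = np.linalg.lstsq(A, b, rcond=None)
```

The tolerance schedule $\varepsilon_k = \varepsilon_0/k^2$ is summable, matching the condition above, so the outer iterates still converge to the true minimizer.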
Acceleration builds on estimate-sequence or Nesterov-type constructions. Auxiliary sequences $\{y_k\}, \{z_k\}$ and mixing weights $\theta_k \in (0, 1]$ are introduced, leading to iterations such as:
$$\begin{aligned} y_k &= \theta_k z_k + (1-\theta_k)\, x_k,\\ x_{k+1} &\approx \operatorname*{arg\,min}_{x \in X} \Big\{ f(x) + \frac{1}{\gamma_k} D_h(x, y_k) \Big\},\\ z_{k+1} &= \operatorname*{arg\,min}_{x \in X} H_{k+1}(x), \end{aligned}$$
with $H_{k+1}$ an appropriately defined estimate function. Rates improve from $O(1/N)$ to $O(1/N^2)$; under strong convexity and Lipschitz assumptions, linear convergence is attained (Yang et al., 2021, Chen et al., 2024, Yan et al., 2020).
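In the Euclidean special case this reduces to Güler-type momentum applied to the exact prox. A simplified sketch for a least-squares objective, where the prox is available in closed form (the instance is an assumed toy example):

```python
import numpy as np

def accelerated_ppa(A, b, gamma, iters):
    """Gueler/Nesterov-style accelerated proximal point (Euclidean kernel)
    for f(x) = 0.5 ||A x - b||^2, whose prox is available in closed form:
    prox_{gamma f}(y) solves (gamma A^T A + I) x = gamma A^T b + y."""
    n = A.shape[1]
    M = gamma * (A.T @ A) + np.eye(n)
    rhs0 = gamma * (A.T @ b)
    x_prev = x = np.zeros(n)
    t_prev = 1.0
    for _ in range(iters):
        t = 0.5 * (1.0 + np.sqrt(1.0 + 4.0 * t_prev ** 2))
        y = x + ((t_prev - 1.0) / t) * (x - x_prev)     # momentum mixing
        x_prev, x = x, np.linalg.solve(M, rhs0 + y)     # exact prox step
        t_prev = t
    return x

rng = np.random.default_rng(3)
A = rng.standard_normal((8, 4))
b = rng.standard_normal(8)
x_acc = accelerated_ppa(A, b, gamma=1.0, iters=200)
x_star, *_ = np.linalg.lstsq(A, b, rcond=None)
```

The $t_k$ recursion is the standard Nesterov weight sequence; replacing the closed-form prox with an approximate inner solver recovers the inexact accelerated schemes discussed above.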
Summary Table (rate and conditions):
| Method | Required Conditions | Complexity Rate |
|---|---|---|
| BPPA | Convex $f$, strongly convex $h$ | $O(1/N)$ |
| Accelerated BPPA | Quadratic scaling / Nesterov acceleration | $O(1/N^2)$ |
| Entropic BPPA | Joint convexity of $D_h$ (e.g., KL) | $O(1/N)$ |
3. Bregman Proximal Point in Structured and Stochastic Settings
Extensions encompass nonconvex, composite, and stochastic objectives. In composite minimization of $g + f$, the Bregman proximal gradient method updates via:
$$x_{k+1} = \operatorname*{arg\,min}_{x} \Big\{ \langle \nabla g(x_k),\, x - x_k \rangle + f(x) + \frac{1}{\gamma_k} D_h(x, x_k) \Big\},$$
where $g$ is smooth (relative to $h$) and $f$ is proximable (Zhou et al., 2015, Guilmeau et al., 2022).
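With the entropy kernel and $f$ the indicator of the simplex, this step becomes the familiar exponentiated-gradient update. A minimal sketch on an assumed toy objective:

```python
import numpy as np

def exponentiated_gradient(grad_g, x0, gamma, iters):
    """Bregman proximal gradient with the entropy kernel and f = simplex
    indicator: each step is the multiplicative (mirror) update
    x_{k+1} ∝ x_k * exp(-gamma * grad_g(x_k))."""
    x = x0.copy()
    for _ in range(iters):
        x = x * np.exp(-gamma * grad_g(x))
        x = x / x.sum()
    return x

# toy objective g(x) = 0.5 ||x - p||^2 with p inside the simplex (assumed)
p = np.array([0.1, 0.6, 0.3])
x = exponentiated_gradient(lambda x: x - p, np.ones(3) / 3,
                           gamma=0.5, iters=300)
```

Because the update is multiplicative, nonnegativity and normalization are preserved automatically, with no Euclidean projection onto the simplex.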
Variance-reduced stochastic algorithms (e.g., SAGA/SVRG-like schemes) apply Bregman regularization to each stochastic subproblem:
$$x_{k+1} = \operatorname*{arg\,min}_{x} \Big\{ \langle \nabla f_{i_k}(x_k) + v_k,\, x \rangle + \frac{1}{\gamma_k} D_h(x, x_k) \Big\},$$
where $v_k$ is a control variate correction ensuring (in expectation) unbiasedness for the global proximal mapping. Such schemes admit sublinear or linear rates depending on convexity and relative smoothness properties (Traoré et al., 18 Oct 2025, Wang et al., 2024).
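A sketch of one such scheme, combining an SVRG-style control variate with an entropic (multiplicative) Bregman step on a toy finite sum of quadratics; all instance details are assumptions for illustration:

```python
import numpy as np

def svrg_mirror(grads, full_grad, x0, gamma, epochs, m, rng):
    """SVRG-style control variate combined with an entropic Bregman step:
    v = grad_i(x) - grad_i(snapshot) + full_grad(snapshot) is unbiased,
    and the update is the multiplicative mirror step on the simplex."""
    x = x0.copy()
    n = len(grads)
    for _ in range(epochs):
        snap, mu = x.copy(), full_grad(x)          # snapshot + full gradient
        for _ in range(m):
            i = rng.integers(n)
            v = grads[i](x) - grads[i](snap) + mu  # variance-reduced estimator
            x = x * np.exp(-gamma * v)             # entropic Bregman step
            x = x / x.sum()
    return x

# toy finite sum: f_i(x) = 0.5 ||x - p_i||^2, mean(p_i) = p (assumed instance)
rng = np.random.default_rng(1)
p = np.array([0.2, 0.5, 0.3])
shifts = [np.array([0.05, -0.05, 0.0]), np.array([-0.05, 0.05, 0.0]),
          np.array([0.0, 0.0, 0.0])]
grads = [lambda x, q=p + d: x - q for d in shifts]
full_grad = lambda x: x - p
x = svrg_mirror(grads, full_grad, np.ones(3) / 3, gamma=0.2,
                epochs=20, m=30, rng=rng)
```

The estimator's variance vanishes as the iterates and snapshots approach the optimum, which is what permits constant stepsizes and the faster rates cited above.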
4. Applications to Unbalanced Optimal Transport
The inexact Bregman proximal point method has demonstrated effectiveness for unbalanced optimal transport (UOT) problems, where the objective is: $\min_{P \ge 0} \langle C, P \rangle + \tau_1 \mathrm{KL}(P \mathbf{1}_m \| a) + \tau_2 \mathrm{KL}(P^T \mathbf{1}_n \| b).$ Choosing the entropy kernel $h(P) = \sum_{ij} P_{ij} (\log P_{ij} - 1)$ produces a matrix KL regularization, and the subproblem becomes a generalized Sinkhorn scaling (Chen et al., 2024). The IBPUOT algorithm runs a fixed number (often just one) of internal scaling updates per outer loop and terminates the inner solve once a summable inexactness criterion on the subproblem residual is satisfied.
IBPUOT provably converges to the UOT solution under summable errors, with an $O(1/N)$ convergence rate and per-iteration complexity essentially matching that of classical scaling, but with far better numerical stability for small regularization. The accelerated version AIBPUOT further reduces the iteration count through estimate-sequence mixing, yielding $O(1/N^2)$ rates (Chen et al., 2024).
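The generalized Sinkhorn scaling that solves the entropy-regularized subproblem can be sketched as follows; this follows the standard entropic-UOT scaling form with exponents $\tau_i/(\tau_i + \varepsilon)$, and the test instance is an assumed toy example (for large $\tau_i$ the marginals of $P$ nearly match $a$ and $b$):

```python
import numpy as np

def uot_sinkhorn(C, a, b, tau1, tau2, eps, iters):
    """Generalized Sinkhorn scaling for the entropy-regularized UOT
    subproblem: alternating updates of the scalings u, v, with exponents
    tau_i / (tau_i + eps) softening the marginal constraints."""
    K = np.exp(-C / eps)
    u = np.ones_like(a)
    v = np.ones_like(b)
    for _ in range(iters):
        u = (a / (K @ v)) ** (tau1 / (tau1 + eps))
        v = (b / (K.T @ u)) ** (tau2 / (tau2 + eps))
    return u[:, None] * K * v[None, :]

rng = np.random.default_rng(2)
C = rng.random((5, 5))
a = np.full(5, 0.2)
b = np.full(5, 0.2)
# very large tau_i ~ hard marginal constraints (balanced OT limit)
P = uot_sinkhorn(C, a, b, tau1=1e6, tau2=1e6, eps=0.1, iters=2000)
```

In IBPUOT, one or a few of these scaling sweeps play the role of the approximate inner solver, with the outer Bregman proximal loop supplying the stability for small $\varepsilon$.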
5. Theoretical Guarantees and Convergence Rates
The canonical one-step decrease inequality underpinning Bregman proximal point convergence reads:
$$\gamma_k \big( f(x_{k+1}) - f(x) \big) \le D_h(x, x_k) - D_h(x, x_{k+1}) - D_h(x_{k+1}, x_k)$$
for arbitrary feasible $x$ (Jiang et al., 2022, Zhou et al., 2015, Yan et al., 2020). Summing over iterations yields telescoping bounds, with immediate consequences:
- Monotonic descent: $f(x_k)$ is nonincreasing.
- Ergodic/sublinear rate: For constant $\gamma_k \equiv \gamma$, the suboptimality decays as $f(x_N) - f(x^\ast) \le D_h(x^\ast, x_0) / (\gamma N) = O(1/N)$.
- Quadratic scaling/acceleration: For kernels with triangle/“quadrangle” scaling properties, the rate improves to $O(1/N^2)$; e.g., for $h$ strongly convex and Lipschitz smooth (Yang et al., 2021, Chen et al., 2024, Yan et al., 2020).
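The telescoped $O(1/N)$ bound can be checked numerically on the entropic simplex example, where for a uniform start $D_h(x^\ast, x_0) = \mathrm{KL}(e_j \,\|\, \mathbf{1}/n) = \log n$; a sketch on an assumed toy instance:

```python
import numpy as np

# verify the telescoped O(1/N) bound for entropic BPPA on f(x) = <c, x>
# over the simplex: f(x_N) - f* <= D_h(x*, x_0) / (gamma N), where for the
# uniform start D_h(x*, x_0) = KL(e_j || uniform) = log n  (toy instance)
n, gamma, N = 4, 0.5, 100
c = np.array([0.7, 0.2, 0.9, 0.4])
x = np.ones(n) / n
for _ in range(N):
    x = x * np.exp(-gamma * c)   # exact multiplicative BPPA step
    x = x / x.sum()
gap = c @ x - c.min()            # suboptimality after N steps
bound = np.log(n) / (gamma * N)  # telescoped upper bound
```

On this instance the realized gap sits far below the worst-case bound, as the iterates in fact converge linearly toward the optimal vertex.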
When the inexactness sequence $\{\varepsilon_k\}$ (from approximate subproblem solutions) is absolutely summable, convergence is preserved, and accelerated methods retain their improved rates under mild scaling hypotheses (Chen et al., 2024, Yang et al., 2021).
6. Impact of the Divergence Generator and Problem Geometry
The choice of the Bregman kernel $h$ (“distance-generating function”) critically affects both convergence and the implicit bias of the method. For linear classification with separable data, BPPA with a fixed kernel $h$ attains a margin lower bound proportional to the maximal margin $\gamma^{\ast}$ under the chosen norm, with a proportionality constant governed by the strong convexity and smoothness parameters $\mu$, $L$ of $h$ (Li et al., 2021). Thus the “condition number” $L/\mu$ of $h$ directly controls the guaranteed margin; ill-conditioning may degrade generalization guarantees.
Further, when reflects the manifold or simplex constraints (e.g., entropic regularization, Kullback–Leibler divergence), updates become multiplicative and naturally enforce sparse or simplex-structured solutions, which is advantageous for tasks such as optimal transport or variational inference (Chen et al., 2024, Guilmeau et al., 2022).
7. Extensions: Manifolds, Nonconvexity, and Equilibrium Problems
The Bregman regularized proximal point paradigm extends to Hadamard manifolds (complete, simply connected Riemannian manifolds of nonpositive sectional curvature). Here, the Bregman distance is defined in terms of geodesics, and convexity is replaced by geodesic convexity. Under additional boundedness and coercivity conditions on the kernel, convergence to equilibrium solutions can be established despite the local nonconvexity of the Bregman term (Sharma et al., 20 Jan 2026).
Nonconvex and composite problems are handled by replacing $f$ with locally accurate convex models; line-search and descent conditions ensure convergence to Clarke stationary points under minimal regularity and growth assumptions (Ochs et al., 2017, Wang et al., 2024).
References:
- (Chen et al., 2024) An inexact Bregman proximal point method and its acceleration version for unbalanced optimal transport.
- (Yang et al., 2021) Bregman Proximal Point Algorithm Revisited: A New Inexact Version and its Inertial Variant.
- (Jiang et al., 2022) Bregman three-operator splitting methods.
- (Zhou et al., 2015) A Simple Convergence Analysis of Bregman Proximal Gradient Algorithm.
- (Traoré et al., 18 Oct 2025) Bregman Stochastic Proximal Point Algorithm with Variance Reduction.
- (Wang et al., 2024) A Bregman Proximal Stochastic Gradient Method with Extrapolation for Nonconvex Nonsmooth Problems.
- (Sharma et al., 20 Jan 2026) A Bregman Regularized Proximal Point Method for Solving Equilibrium Problems on Hadamard Manifolds.
- (Guilmeau et al., 2022) Regularized Rényi divergence minimization through Bregman proximal gradient algorithms.
- (Ochs et al., 2017) Non-smooth Non-convex Bregman Minimization: Unification and new Algorithms.
- (Yan et al., 2020) Bregman Augmented Lagrangian and Its Acceleration.
- (Li et al., 2021) Implicit Regularization of Bregman Proximal Point Algorithm and Mirror Descent on Separable Data.