Randomized Feasibility Algorithm with Polyak Steps
- The paper introduces a randomized feasibility algorithm that replaces full projection onto intersected constraints with tractable, sampled Polyak subgradient updates.
- It employs adaptive, parameter-free step-size strategies to achieve linear convergence in strongly convex cases and optimal sublinear rates for general convex functions.
- Empirical evaluations on QCQP and SVM tasks demonstrate that the method maintains computational efficiency and competitive performance without extensive parameter tuning.
A randomized feasibility algorithm with Polyak steps is a class of iterative methods for constrained convex optimization where computationally tractable projections onto each individual constraint set are used instead of direct projection onto the intersection of all constraints. At each iteration, the algorithm randomly samples constraints and projects the current point towards feasibility using subgradient steps of Polyak type. Adaptive, problem-parameter-free step-size rules and sampled constraint selection enable linear or sublinear convergence rates according to the regularity of the objective function, while maintaining computational practicality when the full constraint projection is prohibitive (Chakraborty et al., 27 Jan 2026).
1. Problem Formulation and Notation
The central problem is of the form

$$\min_{x \in X}\; f(x),$$

where
- $f$ is convex (possibly strongly convex and/or smooth),
- $X = \{x \in C : g_i(x) \le 0,\; i = 1, \dots, m\}$, with each $g_i$ convex,
- $C$ is a simple closed convex set (such as a box or Euclidean ball).
Key notations:
- $\|\cdot\|$ is the Euclidean norm,
- $P_S$ denotes projection onto a closed convex set $S$,
- $[t]_+ = \max\{t, 0\}$,
- $\operatorname{dist}(x, S) = \min_{y \in S} \|x - y\|$.
A global error bound assumption is used: there exist $\mu > 0$ and a sampling distribution over the constraint indices $\{1, \dots, m\}$ such that, for all $x \in C$, the squared distance to the feasible set, $\operatorname{dist}^2(x, X)$, is controlled, up to the factor $1/\mu$, by the expected squared constraint violation of a sampled constraint.
2. Randomized Feasibility Algorithm with Polyak Steps
The algorithm performs a sequence of feasibility updates, the $k$-th consisting of $N_k$ substeps. Each feasibility substep involves:
- Sampling a constraint index $i$ uniformly,
- Computing a subgradient $v \in \partial g_i(x)$,
- Updating via the Polyak-type step
  $$x^+ = P_C\!\left(x - \lambda\,\frac{[g_i(x)]_+}{\|v\|^2}\, v\right),$$
  where $\lambda > 0$ is a relaxation parameter, and the projection is onto $C$.
After the $N_k$ substeps, the resulting point becomes the next iterate. This scheme avoids projection onto the full intersection $X$, replacing it with computationally tractable projections onto $C$ and randomized selection of individual constraints.
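A minimal Python sketch of one feasibility substep: the constraint callables, subgradient callables, and `proj_C` are illustrative placeholders for whatever constraint oracle the application provides, not an interface from the paper.

```python
import numpy as np

def polyak_feasibility_substep(x, constraints, subgrads, proj_C, lam=1.0, rng=None):
    """One randomized Polyak feasibility substep.

    constraints[i](x) evaluates g_i(x); subgrads[i](x) returns some
    v in the subdifferential of g_i at x; proj_C projects onto the
    simple set C.  lam is the relaxation parameter.
    """
    rng = rng or np.random.default_rng()
    i = rng.integers(len(constraints))      # sample a constraint uniformly
    viol = max(constraints[i](x), 0.0)      # positive part [g_i(x)]_+
    if viol == 0.0:
        return proj_C(x)                    # x already satisfies g_i
    v = subgrads[i](x)
    return proj_C(x - lam * viol / np.dot(v, v) * v)   # Polyak-type step
```

Note that a feasible point is a fixed point of the substep, which is what the nonexpansiveness property below relies on.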
Under the error-bound and bounded subgradient assumptions, the following hold:
- Nonexpansiveness: for any feasible $y \in X$, $\|x^+ - y\| \le \|x - y\|$.
- Geometric decrease in infeasibility: $\mathbb{E}\big[\operatorname{dist}^2(x^+, X)\big] \le (1 - q)\operatorname{dist}^2(x, X)$, where $q \in (0, 1)$ depends on the error-bound constant $\mu$, the subgradient bound, and $\lambda$.
3. Interleaved Objective Minimization and Feasibility Updates
The algorithm alternates or interleaves randomized feasibility updates with (sub)gradient steps for objective minimization. Two major cases are considered:
Strongly Convex, $L$-Smooth Objective
Assumptions:
- $f$ has an $L$-Lipschitz gradient,
- $f$ is strongly convex.
Algorithm steps:
- Compute the gradient step $y_k = x_k - \gamma_k \nabla f(x_k)$,
- Update $x_{k+1}$ by applying the randomized feasibility algorithm to $y_k$ with $N_k$ substeps.
The step size $\gamma_k$ follows an adaptive Polyak-type rule driven by a running estimate of the optimal value, where $\varepsilon > 0$ is a prescribed accuracy. Exponentially weighted averaging of the iterates produces the output.
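A minimal sketch of the interleaved loop, assuming the optimal value $f^*$ is known so that the classical Polyak step $(f(x) - f^*)/\|\nabla f(x)\|^2$ can stand in for the paper's adaptive rule (which removes that requirement); `feas_update` abstracts the randomized feasibility stage and all names are illustrative.

```python
import numpy as np

def interleaved_minimization(x0, grad_f, f, f_star, feas_update, n_iters=100):
    """Interleave gradient steps on f with randomized feasibility updates.

    Uses the classical Polyak step (f(x) - f_star) / ||grad f(x)||^2,
    which requires knowing f_star; the adaptive rule in the paper
    replaces this with a running estimate.  feas_update(y, k) runs
    the N_k feasibility substeps of outer iteration k.
    """
    x = x0
    for k in range(n_iters):
        g = grad_f(x)
        gn2 = np.dot(g, g)
        if gn2 > 0:
            step = (f(x) - f_star) / gn2    # Polyak step size
            y = x - step * g
        else:
            y = x                           # stationary: skip the step
        x = feas_update(y, k)               # randomized feasibility stage
    return x
```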
Convex, Possibly Nonsmooth Objective: Distance-over-Weighted-Subgradients (DoWS)
Assumptions:
- $f$ is convex (possibly nondifferentiable),
- $C$ is convex and bounded with diameter $D$.
For $t = 1, \dots, T$ iterations:
- Maintain a subgradient $v_t \in \partial f(x_t)$,
- Form the DoWS step size from the diameter $D$ and a weighted accumulation of past subgradient norms,
- Take the resulting subgradient step and project onto $C$,
- Compute the averaging weight for the current iterate,
- Apply the randomized feasibility update as above.
A weighted average of the iterates is returned as the output.
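A simple stand-in illustrates the parameter-free spirit of DoWS: the sketch below uses a distance-over-accumulated-subgradient-norms (AdaGrad-norm-style) step size, which is not the paper's exact weighting but shares the same structure of diameter over accumulated subgradient information, with a step-weighted average as output.

```python
import numpy as np

def dows_style_run(x0, subgrad_f, proj_C, D, n_iters=1000):
    """Parameter-free projected subgradient method in the spirit of DoWS.

    Step size D / sqrt(sum of squared subgradient norms so far),
    used here as an illustrative stand-in for the paper's weighting.
    Returns the step-weighted average of the iterates.
    """
    x = x0.copy()
    acc = 0.0                                 # running sum of ||v_t||^2
    avg, wsum = np.zeros_like(x0), 0.0
    for _ in range(n_iters):
        v = subgrad_f(x)
        acc += np.dot(v, v)
        if acc == 0.0:
            break                             # zero subgradient at the start
        step = D / np.sqrt(acc)
        x = proj_C(x - step * v)
        avg, wsum = avg + step * x, wsum + step   # step-weighted average
    return avg / wsum if wsum > 0 else x
```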
4. Convergence Guarantees and Theoretical Rates
Strongly Convex, Smooth Case
For the adaptive step sizes above and exponential weighting, the expected optimality gap contracts linearly to the prescribed accuracy $\varepsilon$ after $K$ outer iterations, provided the mean reduction in infeasibility per iteration meets a prescribed threshold (Chakraborty et al., 27 Jan 2026).
Convex, Possibly Nonsmooth Case
After $T$ iterations using DoWS with interleaved feasibility updates, the output $\bar{x}_T$ satisfies
$$\mathbb{E}\big[f(\bar{x}_T) - f^*\big] \le A_1(T) + A_2(\tau) + A_3(T),$$
with \begin{align*} A_1(T) &= \frac{2 D M_f}{\sqrt{T}}\left(\frac{D}{r}\right)^{\frac{2}{T}\ln(e D^2/r^2)},\\ A_2(\tau) &= D M_f \max_{1\le k\le\tau} \mathbb{E}\big[(1-q)^{N_k/2}\big],\\ A_3(T) &= \frac{D M_f}{T} \left(\frac{D}{r}\right)^{\frac{2}{T}\ln(e D^2/r^2)} \sum_{k=1}^{\tau} \mathbb{E}\big[(1-q)^{N_k/2}\big], \end{align*} yielding the optimal $O(1/\sqrt{T})$ rate as $T \to \infty$, up to sampling-determined terms.
For unbounded $C$, a tamed (logarithmically adjusted) variant of the DoWS step size ensures bounded iterates and the same expected error rate, up to constants that grow logarithmically in $T$.
5. Sampling Distribution Regimes and Computational Properties
Performance and theoretical rates depend critically on the sampling distribution of the number $N_k$ of feasibility substeps at each outer iteration. For common regimes:
- Deterministic polynomial growth: letting $N_k$ grow polynomially in $k$ ensures that the sum $\sum_k (1-q)^{N_k/2}$ is uniformly bounded.
- Poisson sampling: $N_k \sim \mathrm{Poisson}(\lambda_k)$ with logarithmically growing mean yields $\mathbb{E}\big[(1-q)^{N_k/2}\big] = e^{-\lambda_k(1-\sqrt{1-q})}$, which decays polynomially in $k$.
- Binomial sampling: $N_k \sim \mathrm{Bin}(n_k, p)$ with suitably growing $n_k$ gives similar decay properties, since $\mathbb{E}\big[(1-q)^{N_k/2}\big] = \big(1 - p(1-\sqrt{1-q})\big)^{n_k}$.
Sub-polynomial growth of $N_k$ suffices to make the sampling-driven error negligible at polylogarithmic cost in total feasibility steps.
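The regimes above are easy to simulate to see how fast $(1-q)^{N_k/2}$ decays; the growth rates chosen below are illustrative, not the paper's constants.

```python
import numpy as np

def sample_substep_counts(tau, regime="poisson", rng=None):
    """Draw the number of feasibility substeps N_k per outer iteration.

    Illustrates the three sampling regimes; logarithmic growth
    rates here are illustrative choices.
    """
    rng = rng or np.random.default_rng(0)
    ks = np.arange(1, tau + 1)
    if regime == "deterministic":
        return np.ceil(np.log1p(ks) ** 2).astype(int)   # slow deterministic growth
    if regime == "poisson":
        return rng.poisson(lam=np.log1p(ks) + 1)        # Poisson, growing mean
    if regime == "binomial":
        n = (np.log1p(ks) * 4 + 4).astype(int)
        return rng.binomial(n=n, p=0.5)                 # Binomial, growing trials
    raise ValueError(regime)

def decay_factor(N, q=0.3):
    """Per-iteration infeasibility factor (1-q)^{N_k/2} for each k."""
    return (1 - q) ** (N / 2)
```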
6. Empirical Evaluation: QCQP and SVM Applications
Simulations were conducted on two canonical classes of problems:
Quadratically Constrained Quadratic Programming (QCQP)
The QCQP (a convex quadratic objective under convex quadratic inequality constraints) was tested in three regimes:
- (a) strongly convex objective with known optimal value,
- (b) strongly convex objective with unknown optimal value,
- (c) convex objective with unknown optimal value.
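In the QCQP setting the per-constraint step is particularly cheap: for a quadratic constraint $x^\top A x + b^\top x + c \le 0$, the Polyak update needs only one matrix-vector product rather than a projection onto the quadratic set. A sketch (matrices and names illustrative):

```python
import numpy as np

def quad_constraint_polyak_step(x, A, b, c, lam=1.0):
    """Polyak-type step toward the set {x : x^T A x + b^T x + c <= 0}.

    The gradient of the quadratic constraint is 2 A x + b, so each
    step costs one matrix-vector product; no projection onto the
    quadratic set itself is required.
    """
    g = x @ A @ x + b @ x + c      # constraint value g_i(x)
    if g <= 0:
        return x                   # already feasible for this constraint
    v = 2 * A @ x + b              # gradient of the quadratic
    return x - lam * g / np.dot(v, v) * v
```

Iterating this step on a single violated constraint drives the violation to zero; for the unit-ball constraint it reduces to Newton's method on the boundary equation.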
Baselines included the subgradient-projection method of Nedić et al., the Arrow–Hurwicz and Alt-GDA primal-dual schemes, ACVI (ADMM with log-barrier), and the CVXPY interior-point solver.
Key observations:
- The adaptive Polyak-step algorithm achieved linear convergence in regime (a), requiring no prior knowledge of the strong convexity or smoothness parameters.
- DoWS and T-DoWS performed competitively in regimes (b) and (c), attaining the expected $O(1/\sqrt{T})$ rate slope.
- ACVI provided the fastest infeasibility decay but required expensive tuning.
Support Vector Machine (SVM) Soft-Margin Classification
For the soft-margin SVM problem (hinge-loss minimization under margin constraints), the UCI Banknote, Breast-Cancer, and MNIST 3-vs-5 datasets were used. Because the objective is convex but not strongly convex, only the DoWS/T-DoWS methods and the primal-dual (Arrow–Hurwicz/Alt-GDA) baselines were compared.
Results:
- DoWS/T-DoWS schemes reduced objective and infeasibility rapidly;
- Test-set misclassification rates were competitive with cross-validated primal-dual methods;
- Methods required no parameter tuning.
7. Theoretical Significance and Practical Implications
Randomized feasibility algorithms with Polyak steps provide a rigorously justified, computation-efficient approach to large-scale constrained convex optimization where projection onto intersected constraints is intractable. Theoretical results guarantee:
- Linear convergence to any prespecified tolerance for strongly convex, $L$-smooth $f$;
- Optimal $O(1/\sqrt{T})$ rates in the convex, potentially nonsmooth setting;
- Bounded sampling-driven error without demanding hyperparameter tuning or explicit knowledge of problem parameters.
Empirical results indicate practical competitiveness against state-of-the-art first-order and primal-dual methods, particularly when problem structure or scale make conventional projection approaches prohibitively costly (Chakraborty et al., 27 Jan 2026).