Papers
Topics
Authors
Recent
Gemini 2.5 Flash
Gemini 2.5 Flash
166 tokens/sec
GPT-4o
7 tokens/sec
Gemini 2.5 Pro Pro
42 tokens/sec
o3 Pro
4 tokens/sec
GPT-4.1 Pro
38 tokens/sec
DeepSeek R1 via Azure Pro
28 tokens/sec
2000 character limit reached

Variance-reduced first-order methods for deterministically constrained stochastic nonconvex optimization with strong convergence guarantees (2409.09906v3)

Published 16 Sep 2024 in math.OC, cs.LG, cs.NA, math.NA, and stat.ML

Abstract: In this paper, we study a class of deterministically constrained stochastic optimization problems. Existing methods typically aim to find an $\epsilon$-stochastic stationary point, where the expected violations of both constraints and first-order stationarity are within a prescribed accuracy $\epsilon$. However, in many practical applications, it is crucial that the constraints be nearly satisfied with certainty, making such an $\epsilon$-stochastic stationary point potentially undesirable due to the risk of significant constraint violations. To address this issue, we propose single-loop variance-reduced stochastic first-order methods, where the stochastic gradient of the stochastic component is computed using either a truncated recursive momentum scheme or a truncated Polyak momentum scheme for variance reduction, while the gradient of the deterministic component is computed exactly. Under the error bound condition with a parameter $\theta \geq 1$ and other suitable assumptions, we establish that these methods respectively achieve a sample and first-order operation complexity of $\widetilde O(\epsilon{-\max{\theta+2, 2\theta}})$ and $\widetilde O(\epsilon{-\max{4, 2\theta}})$ for finding a stronger $\epsilon$-stochastic stationary point, where the constraint violation is within $\epsilon$ with certainty, and the expected violation of first-order stationarity is within $\epsilon$. For $\theta=1$, these complexities reduce to $\widetilde O(\epsilon{-3})$ and $\widetilde O(\epsilon{-4})$ respectively, which match, up to a logarithmic factor, the best-known complexities achieved by existing methods for finding an $\epsilon$-stochastic stationary point of unconstrained smooth stochastic optimization problems.

Citations (1)

Summary

We haven't generated a summary for this paper yet.