Power of Generalized Smoothness in Stochastic Convex Optimization: First- and Zero-Order Algorithms (2501.18198v2)

Published 30 Jan 2025 in math.OC

Abstract: This paper is devoted to the study of stochastic optimization problems under the generalized smoothness assumption. By considering the unbiased gradient oracle in Stochastic Gradient Descent, we provide strategies that make the summands describing a linear rate appear in the convergence bounds. In particular, in the case $L_0 = 0$, we obtain in the convex setup the iteration complexity $N = \mathcal{O}\left(L_1 R \log\frac{1}{\varepsilon} + \frac{L_1 c R^2}{\varepsilon}\right)$ for Clipped Stochastic Gradient Descent and $N = \mathcal{O}\left(L_1 R \log\frac{1}{\varepsilon}\right)$ for Normalized Stochastic Gradient Descent. Furthermore, we generalize the convergence results to the case of a biased gradient oracle and show that the power of $(L_0,L_1)$-smoothness extends to zero-order algorithms. Finally, we demonstrate through numerical experiments the possibility of linear convergence in the convex setup, which has attracted some interest in the machine learning community.
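To make the two update rules mentioned in the abstract concrete, below is a minimal sketch of Clipped SGD and Normalized SGD steps. The toy quadratic objective, the noise level, the step size `gamma`, and the clip level `c` are illustrative placeholders chosen here for demonstration, not the paper's experimental setup or its exact parameter choices.

```python
import numpy as np

# Illustrative sketch only: a toy quadratic objective with a noisy
# (unbiased) gradient oracle; parameters below are hypothetical.

rng = np.random.default_rng(0)

def grad_oracle(x, noise=0.01):
    """Unbiased stochastic gradient of f(x) = 0.5 * ||x||^2."""
    return x + noise * rng.standard_normal(x.shape)

def clipped_sgd_step(x, gamma, c):
    """Clipped SGD: rescale the gradient so its norm never exceeds c."""
    g = grad_oracle(x)
    g_norm = max(np.linalg.norm(g), 1e-12)
    return x - gamma * min(1.0, c / g_norm) * g

def normalized_sgd_step(x, gamma):
    """Normalized SGD: always move by a step of length gamma."""
    g = grad_oracle(x)
    return x - gamma * g / max(np.linalg.norm(g), 1e-12)

x = np.ones(10)
for _ in range(200):
    x = clipped_sgd_step(x, gamma=0.1, c=1.0)
print("distance to optimum:", np.linalg.norm(x))
```

Both methods bound the effective step length, which is the mechanism that lets the analysis handle $(L_0, L_1)$-generalized smoothness, where the local smoothness constant may grow with the gradient norm.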
