
Stochastic Bregman Proximal Gradient Method Revisited: Kernel Conditioning and Painless Variance Reduction

Published 6 Jan 2024 in math.OC | (2401.03155v3)

Abstract: We investigate Bregman proximal gradient (BPG) methods for solving nonconvex composite stochastic optimization problems. Instead of the standard gradient Lipschitz continuity (GLC) assumption, the objective function only satisfies a smooth-adaptability assumption w.r.t. some kernel function. First, we analyze the stationarity measure in depth and show that the Bregman proximal gradient mapping widely adopted in existing works may not correctly depict the near stationarity of solutions. To resolve this issue, we propose and analyze a new Bregman proximal gradient mapping. Second, we thoroughly analyze the sample complexities of stochastic Bregman proximal gradient methods under both the old and the newly proposed gradient mappings. Because the absence of GLC disables the standard analysis of stochastic variance reduction techniques, existing stochastic BPG methods only obtain an $O(\epsilon^{-2})$ sample complexity under the old gradient mapping. We show that this limitation in the existing analyses mainly comes from insufficient exploitation of the kernel's properties. By proposing a new kernel-conditioning regularity assumption on the kernel, we show that a simple epoch bound mechanism is enough to enable all the existing variance reduction techniques for stochastic BPG methods. Combined with a novel probabilistic argument, we show that there is a high-probability event conditioned on which an $O(\sqrt{n}\epsilon^{-1})$ sample complexity for the finite-sum stochastic setting can be derived for the old gradient mapping. Moreover, with a novel adaptive step size control mechanism, we also show an $\tilde{O}(\sqrt{n}L_h(\mathcal{X}_\epsilon)\epsilon^{-1})$ complexity under the new gradient mapping.
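For readers unfamiliar with the basic BPG update that the abstract builds on, the following is a minimal sketch of one deterministic, unconstrained Bregman gradient step. It is not the paper's algorithm: it has no stochastic gradient estimator, no nonsmooth composite term, and no variance reduction. The kernel $h(x) = \frac{1}{2}\|x\|^2 + \frac{1}{4}\|x\|^4$ is an assumed choice that is common in the BPG literature and is not taken from the abstract. The test function `grad_f` and the names `grad_h`, `grad_h_inverse`, and `bpg_step` are likewise hypothetical and chosen for illustration only.

```python
import math

def grad_h(x):
    # Gradient of the assumed kernel h(x) = 0.5*||x||^2 + 0.25*||x||^4,
    # which is (1 + ||x||^2) * x.
    sq = sum(v * v for v in x)
    return [(1.0 + sq) * v for v in x]

def grad_h_inverse(p):
    # Solve grad_h(y) = p for y. The solution has the form y = t * p with
    # t in (0, 1] satisfying c*t^3 + t - 1 = 0, where c = ||p||^2.
    # The left-hand side is increasing in t, so bisection on [0, 1] works.
    c = sum(v * v for v in p)
    lo, hi = 0.0, 1.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if c * mid ** 3 + mid - 1.0 > 0.0:
            hi = mid
        else:
            lo = mid
    t = 0.5 * (lo + hi)
    return [t * v for v in p]

def bpg_step(x, grad_f, step):
    # One unconstrained Bregman gradient step (no nonsmooth term):
    # grad_h(x_new) = grad_h(x) - step * grad_f(x).
    g = grad_f(x)
    p = [a - step * b for a, b in zip(grad_h(x), g)]
    return grad_h_inverse(p)

def grad_f(x):
    # Illustrative objective f(x) = 0.25*(||x||^2 - 1)^2, whose gradient
    # (||x||^2 - 1) * x is not globally Lipschitz.
    sq = sum(v * v for v in x)
    return [(sq - 1.0) * v for v in x]

# Usage: iterate from an arbitrary starting point with an assumed step size.
x = [2.0, -1.0]
for _ in range(200):
    x = bpg_step(x, grad_f, 0.5)
print(math.sqrt(sum(v * v for v in x)))  # norm of the final iterate
```

With a Euclidean kernel $h(x) = \frac{1}{2}\|x\|^2$ the same update reduces to ordinary gradient descent; the quartic term in the kernel is what lets the step adapt to an objective whose gradient grows faster than linearly.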

