Variants of SGD for Lipschitz Continuous Loss Functions in Low-Precision Environments (2211.04655v7)

Published 9 Nov 2022 in math.OC and cs.LG

Abstract: Motivated by neural network training in low-precision arithmetic environments, this work studies the convergence of variants of SGD that use adaptive step sizes in the presence of computational error. For a general stochastic Lipschitz continuous loss function, asymptotic convergence to a Clarke stationary point is proven, along with non-asymptotic convergence to an approximate stationary point. The analysis assumes that only an approximation of the loss function's stochastic gradient can be computed, and that the SGD step itself is also computed with error. Different variants of SGD are tested empirically, and improved test set accuracy is observed compared to SGD on two image recognition tasks.
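
To make the setting in the abstract concrete, here is a minimal sketch of SGD with both sources of error: an inexact stochastic (sub)gradient and a rounded update step. It is an illustration only, not the paper's algorithm. The loss |x - target|, the fixed-point grid in `quantize`, the uniform noise model, and the 1/sqrt(k) step-size schedule are all assumptions chosen for simplicity; the paper's adaptive step sizes are not reproduced here.

```python
import random


def quantize(value, step=2.0 ** -8):
    """Round to the nearest multiple of `step` (assumed stand-in for low-precision storage)."""
    return round(value / step) * step


def noisy_subgradient(x, target, rng, noise=0.05):
    """Subgradient of the Lipschitz loss |x - target|, perturbed to mimic inexact evaluation."""
    if x > target:
        g = 1.0
    elif x < target:
        g = -1.0
    else:
        g = 0.0  # 0 is a valid subgradient at the kink
    return g + rng.uniform(-noise, noise)  # assumed bounded gradient error


def low_precision_sgd(x0, target, iterations=2000, seed=0):
    """Run SGD where the gradient is approximate and every iterate is rounded."""
    rng = random.Random(seed)
    x = quantize(x0)
    for k in range(1, iterations + 1):
        step_size = 1.0 / k ** 0.5  # diminishing schedule (an assumption, not the paper's rule)
        g = noisy_subgradient(x, target, rng)
        x = quantize(x - step_size * g)  # error in computing the SGD step itself
    return x


final = low_precision_sgd(x0=4.0, target=1.0)
print(abs(final - 1.0))  # distance from the minimizer after the last iteration
```

In this toy problem the iterate ends up near the minimizer rather than exactly at it, since each update is perturbed by gradient noise and then snapped to the grid. This loosely mirrors the abstract's notion of an approximate stationary point.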

Citations (1)
