Variants of SGD for Lipschitz Continuous Loss Functions in Low-Precision Environments (2211.04655v7)

Published 9 Nov 2022 in math.OC and cs.LG

Abstract: Motivated by neural network training in low-precision arithmetic environments, this work studies the convergence of variants of SGD using adaptive step sizes with computational error. Considering a general stochastic Lipschitz continuous loss function, an asymptotic convergence result to a Clarke stationary point is proven as well as the non-asymptotic convergence to an approximate stationary point. It is assumed that only an approximation of the loss function's stochastic gradient can be computed in addition to error in computing the SGD step itself. Different variants of SGD are tested empirically, where improved test set accuracy is observed compared to SGD for two image recognition tasks.
