On Markov Chain Gradient Descent (1809.04216v1)

Published 12 Sep 2018 in math.OC and stat.ML

Abstract: Stochastic gradient methods are the workhorse algorithms for large-scale optimization problems in machine learning, signal processing, and other computational sciences and engineering. This paper studies Markov chain gradient descent, a variant of stochastic gradient descent where the random samples are taken along the trajectory of a Markov chain. Existing results for this method assume convex objectives and a reversible Markov chain and thus have their limitations. We establish new non-ergodic convergence under wider step sizes, for nonconvex problems, and for non-reversible finite-state Markov chains. Nonconvexity makes our method applicable to broader problem classes. Non-reversible finite-state Markov chains, on the other hand, can mix substantially faster. To obtain these results, we introduce a new technique that varies the mixing levels of the Markov chains. The reported numerical results validate our contributions.
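To make the idea in the abstract concrete, here is a minimal sketch of a gradient method whose sample index follows a Markov chain instead of being drawn independently. Everything specific in it is an assumption made for illustration and is not taken from the paper: the function name `mcgd`, the scalar least-squares objective, the lazy directed walk on a cycle (a simple non-reversible chain with a uniform stationary distribution), and the diminishing step size `1 / k**q`.

```python
import random


def mcgd(targets, steps=20000, q=0.75, move_prob=0.5, seed=0):
    """Minimize (1/M) * sum_i 0.5 * (x - targets[i])**2,
    choosing the index i along the trajectory of a Markov chain.

    Hypothetical illustration only; not the paper's reference implementation.
    """
    rng = random.Random(seed)
    m = len(targets)
    state = 0  # current chain state = index of the component function
    x = 0.0    # iterate
    for k in range(1, steps + 1):
        # Lazy directed walk on a cycle: move to the next state with
        # probability move_prob, otherwise stay put. The transition matrix
        # is doubly stochastic, so the stationary distribution is uniform,
        # and the chain is non-reversible for m >= 3.
        if rng.random() < move_prob:
            state = (state + 1) % m
        grad = x - targets[state]  # gradient of 0.5 * (x - a_i)**2 at x
        x -= grad / k**q           # assumed diminishing step size k**(-q)
    return x


targets = [1.0, 2.0, 6.0, 7.0]
x_final = mcgd(targets)
print(x_final, sum(targets) / len(targets))
```

Because the chain's stationary distribution is uniform, the iterate should settle near the mean of `targets`, which minimizes the averaged objective, even though consecutive samples are strongly correlated.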

Citations (95)
