
An Accelerated Distributed Stochastic Gradient Method with Momentum (2402.09714v2)

Published 15 Feb 2024 in math.OC, cs.DC, and cs.MA

Abstract: In this paper, we introduce an accelerated distributed stochastic gradient method with momentum for solving the distributed optimization problem, in which a group of $n$ agents collaboratively minimizes the average of the local objective functions over a connected network. The method, termed ``Distributed Stochastic Momentum Tracking (DSMT)'', is a single-loop algorithm that combines the momentum tracking technique with the Loopless Chebyshev Acceleration (LCA) method. We show that DSMT asymptotically achieves convergence rates comparable to those of the centralized stochastic gradient descent (SGD) method under a general variance condition on the stochastic gradients. Moreover, the number of iterations (the transient time) required for DSMT to achieve such rates behaves as $\mathcal{O}(n^{5/3}/(1-\lambda))$ for minimizing general smooth objective functions, and as $\mathcal{O}(\sqrt{n/(1-\lambda)})$ under the Polyak-{\L}ojasiewicz (PL) condition. Here, $1-\lambda$ denotes the spectral gap of the mixing matrix associated with the underlying network topology. Notably, these results do not rely on multiple inter-node communications or stochastic gradient accumulation per iteration, and, to the best of our knowledge, the transient times are the shortest obtained under this setting.
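
To give a feel for how the two transient-time orders scale with the network, the following is a minimal sketch, not code from the paper. It assumes a ring of $n$ agents with uniform mixing weights of 1/3 (a choice made here purely for illustration), for which the eigenvalues of the mixing matrix are known in closed form, and it drops the constants hidden by the $\mathcal{O}(\cdot)$ notation. The function names are hypothetical.

```python
import math


def ring_spectral_gap(n):
    """Spectral gap 1 - lambda of a ring of n >= 3 agents.

    Assumption (illustrative, not from the paper): each agent gives
    weight 1/3 to itself and to each of its two neighbours.
    """
    if n < 3:
        raise ValueError("a ring needs at least 3 agents")
    # Eigenvalues of the resulting circulant mixing matrix.
    eigs = [(1 + 2 * math.cos(2 * math.pi * k / n)) / 3 for k in range(n)]
    # lambda: largest magnitude among the eigenvalues other than the
    # trivial eigenvalue 1 (index k = 0).
    lam = max(abs(mu) for mu in eigs[1:])
    return 1 - lam


def transient_order_general(n, gap):
    """n^(5/3) / (1 - lambda): order for general smooth objectives."""
    return n ** (5 / 3) / gap


def transient_order_pl(n, gap):
    """sqrt(n / (1 - lambda)): order under the PL condition."""
    return math.sqrt(n / gap)


if __name__ == "__main__":
    for n in (4, 16, 64):
        gap = ring_spectral_gap(n)
        print(n, gap, transient_order_general(n, gap), transient_order_pl(n, gap))
```

On a ring the spectral gap shrinks as $n$ grows, so both orders increase with network size through $n$ and through $1/(1-\lambda)$. These values are orders of growth only and should not be read as iteration counts.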

Authors (3)
  1. Kun Huang (85 papers)
  2. Shi Pu (109 papers)
  3. Angelia Nedić (67 papers)
Citations (5)
