
Achieving Linear Speedup with ProxSkip in Distributed Stochastic Optimization (2310.07983v3)

Published 12 Oct 2023 in cs.LG, math.OC, and stat.ML

Abstract: The ProxSkip algorithm for distributed optimization is gaining increasing attention due to its proven benefits in reducing communication complexity while maintaining robustness against data heterogeneity. However, existing analyses of ProxSkip are limited to the strongly convex setting and do not achieve linear speedup, where convergence performance improves linearly with the number of nodes. So far, questions remain open about how ProxSkip behaves in the non-convex setting and whether linear speedup is achievable. In this paper, we revisit distributed ProxSkip and address both questions. We demonstrate that the leading communication complexity of ProxSkip is $\mathcal{O}(\frac{p\sigma^2}{n\epsilon^2})$ for the non-convex and convex settings, and $\mathcal{O}(\frac{p\sigma^2}{n\epsilon})$ for the strongly convex setting, where $n$ represents the number of nodes, $p$ denotes the probability of communication, $\sigma^2$ signifies the level of stochastic noise, and $\epsilon$ denotes the desired accuracy level. This result illustrates that ProxSkip achieves linear speedup and can asymptotically reduce communication overhead proportional to the probability of communication. Additionally, for the strongly convex setting, we further prove that ProxSkip can achieve linear speedup with network-independent stepsizes.
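To make the role of the communication probability $p$ concrete, below is a minimal sketch of a distributed ProxSkip-style update (in the Scaffnew spirit): each node takes local stochastic gradient steps shifted by a control variate, and only with probability $p$ performs the expensive prox step, which for the consensus constraint amounts to averaging across nodes. The synthetic quadratic objectives, stepsize, and all variable names below are illustrative assumptions, not taken from the paper.

```python
# Illustrative sketch of distributed ProxSkip with probabilistic communication.
# Problem setup (least-squares local losses) and hyperparameters are assumptions.
import numpy as np

rng = np.random.default_rng(0)

n, d = 8, 5                                   # number of nodes, dimension
A = [rng.standard_normal((20, d)) for _ in range(n)]
b = [rng.standard_normal(20) for _ in range(n)]

def stoch_grad(i, x):
    """Stochastic gradient of the local loss f_i(x) = (1/2m) * ||A_i x - b_i||^2."""
    j = rng.integers(A[i].shape[0])           # sample one row of node i's data
    return A[i][j] * (A[i][j] @ x - b[i][j])

gamma = 0.05                                  # stepsize
p = 0.2                                       # probability of communication
T = 2000                                      # number of iterations

x = np.zeros((n, d))                          # local iterates x_i
h = np.zeros((n, d))                          # local control variates h_i

for t in range(T):
    # Local stochastic gradient step, shifted by the control variate.
    x_hat = x - gamma * (np.array([stoch_grad(i, x[i]) for i in range(n)]) - h)

    if rng.random() < p:
        # Communication round: the prox of the consensus constraint is averaging.
        x_next = np.tile((x_hat - (gamma / p) * h).mean(axis=0), (n, 1))
    else:
        # Skip communication: keep purely local iterates.
        x_next = x_hat

    # Control-variate update; it changes only after communication rounds.
    h = h + (p / gamma) * (x_next - x_hat)
    x = x_next

print("consensus error:", np.linalg.norm(x - x.mean(axis=0)))
```

In this sketch, communication happens on roughly a $p$ fraction of iterations, which is how the factor $p$ enters the communication complexity bounds quoted above.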

Citations (1)

