
Distributed Stochastic Variance Reduced Gradient Methods and A Lower Bound for Communication Complexity (1507.07595v2)

Published 27 Jul 2015 in math.OC, cs.LG, and stat.ML

Abstract: We study distributed optimization algorithms for minimizing the average of convex functions. Applications include empirical risk minimization problems in statistical machine learning where the datasets are large and must be stored on different machines. We design a distributed stochastic variance reduced gradient algorithm that, under certain conditions on the condition number, simultaneously achieves the optimal parallel runtime, amount of communication, and number of rounds of communication among all distributed first-order methods, up to constant factors. Our method and its accelerated extension also outperform existing distributed algorithms in terms of the rounds of communication, as long as the condition number is not too large compared to the size of the data on each machine. We also prove a lower bound on the number of rounds of communication for a broad class of distributed first-order methods, including the algorithms proposed in this paper. We show that our accelerated distributed stochastic variance reduced gradient algorithm achieves this lower bound, so it uses the fewest rounds of communication among all distributed first-order algorithms.
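The building block of the proposed method is the stochastic variance reduced gradient (SVRG) update for objectives of the form min_x (1/n) Σ_i f_i(x). Below is a minimal serial sketch, assuming each component gradient is available through a user-supplied `grad_i(x, i)` callback; the function names and snapshot rule here are illustrative, and the paper's distributed variant additionally partitions the components across machines and aggregates the full gradient once per communication round.

```python
import numpy as np

def svrg(grad_i, x0, n, step_size, num_epochs, inner_steps):
    """Serial SVRG sketch.

    grad_i(x, i): gradient of the i-th component f_i at x (user-supplied).
    n: number of component functions.
    """
    x_snapshot = np.asarray(x0, dtype=float).copy()
    for _ in range(num_epochs):
        # Full (batch) gradient at the snapshot point; in a distributed setting
        # this is what each machine would compute locally and then aggregate.
        full_grad = np.mean([grad_i(x_snapshot, i) for i in range(n)], axis=0)
        x = x_snapshot.copy()
        for _ in range(inner_steps):
            i = np.random.randint(n)
            # Variance-reduced gradient estimate: unbiased, with variance that
            # shrinks as x approaches the snapshot point.
            g = grad_i(x, i) - grad_i(x_snapshot, i) + full_grad
            x = x - step_size * g
        x_snapshot = x  # use the last inner iterate as the next snapshot
    return x_snapshot
```

For example, for ridge regression with f_i(x) = (a_i^T x - b_i)^2 / 2 + (λ/2)||x||^2, the callback could return `A[i] * (A[i] @ x - b[i]) + lam * x`; the inner step count is typically chosen on the order of n or of the condition number.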

Authors (4)
  1. Jason D. Lee (151 papers)
  2. Qihang Lin (58 papers)
  3. Tengyu Ma (117 papers)
  4. Tianbao Yang (162 papers)
Citations (16)
