
Utilizing Redundancy in Cost Functions for Resilience in Distributed Optimization and Learning (2110.10858v1)

Published 21 Oct 2021 in cs.DC and cs.LG

Abstract: This paper considers the problem of resilient distributed optimization and stochastic machine learning in a server-based architecture. The system comprises a server and multiple agents, where each agent has a local cost function. The agents collaborate with the server to find a minimum of their aggregate cost function. We consider the case when some of the agents may be asynchronous and/or Byzantine faulty. In this case, the classical algorithm of distributed gradient descent (DGD) is rendered ineffective. Our goal is to design techniques that improve the efficacy of DGD under asynchrony and Byzantine failures. To do so, we begin by proposing a way to model the agents' cost functions through the generic notion of $(f, r; \epsilon)$-redundancy, where $f$ and $r$ are the parameters of Byzantine failures and asynchrony, respectively, and $\epsilon$ characterizes the closeness between the agents' cost functions. This allows us to quantify the level of redundancy present amongst the agents' cost functions for any given distributed optimization problem. We demonstrate, both theoretically and empirically, the merits of our proposed redundancy model in improving the robustness of DGD against asynchronous and Byzantine agents, and of its extension to distributed stochastic gradient descent (D-SGD) for robust distributed machine learning with asynchronous and Byzantine agents.
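
To make the server-based setting concrete, below is a minimal Python sketch of one round of fault-tolerant distributed gradient descent in the spirit of the abstract: the server uses only the first $n - r$ gradient reports (so up to $r$ agents may be asynchronous or silent) and applies a coordinate-wise trimmed mean as a stand-in robust aggregator against up to $f$ Byzantine reports. The trimmed-mean rule, the function names, and the toy quadratic costs with nearby minimizers (a proxy for redundant cost functions) are assumptions made for illustration, not the paper's own algorithm or its $(f, r; \epsilon)$-redundancy condition.

```python
import numpy as np

def trimmed_mean(grads: np.ndarray, f: int) -> np.ndarray:
    """Coordinate-wise trimmed mean: discard the f largest and f smallest
    entries in each coordinate and average the rest (needs > 2f rows)."""
    sorted_grads = np.sort(grads, axis=0)
    return sorted_grads[f:sorted_grads.shape[0] - f].mean(axis=0)

def robust_dgd_round(x, gradient_fns, n, f, r, step_size=0.1):
    """One illustrative server round: broadcast x, use the first n - r
    gradient reports (up to r agents may be slow or absent, and up to f
    reports may be arbitrary), aggregate robustly, and take a step."""
    received = [g(x) for g in gradient_fns[: n - r]]  # first n - r responders
    return x - step_size * trimmed_mean(np.stack(received), f)

# Toy run (hypothetical setup): 6 quadratic agents with nearby minimizers,
# one straggler (r = 1) and one Byzantine agent (f = 1) reporting noise.
rng = np.random.default_rng(0)
minima = [np.array([1.0, -1.0]) + 0.05 * rng.standard_normal(2) for _ in range(6)]
gradient_fns = [lambda x, m=m: x - m for m in minima]        # honest gradients
gradient_fns[0] = lambda x: 100.0 * rng.standard_normal(2)   # Byzantine report

x = np.zeros(2)
for _ in range(200):
    x = robust_dgd_round(x, gradient_fns, n=6, f=1, r=1, step_size=0.1)
print(x)  # stays close to the shared minimizer despite faults and asynchrony
```

The aggregation step is the only difference from plain DGD: with redundant (hence mutually close) cost functions, dropping a few extreme or missing reports still leaves enough information to approach a minimizer of the aggregate cost, which is the intuition the redundancy model quantifies.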
