Understanding How Consistency Works in Federated Learning via Stage-wise Relaxed Initialization (2306.05706v1)

Published 9 Jun 2023 in cs.LG, cs.DC, and math.OC

Abstract: Federated learning (FL) is a distributed paradigm that coordinates massive numbers of local clients to collaboratively train a global model via stage-wise local training on heterogeneous datasets. Previous works have observed that FL suffers from the "client drift" problem, which is caused by the inconsistent optima across local clients. However, a solid theoretical analysis explaining the impact of this local inconsistency is still lacking. To alleviate the negative impact of client drift and explore its substance in FL, in this paper we first design an efficient FL algorithm, FedInit, which employs a personalized relaxed initialization state at the beginning of each local training stage. Specifically, FedInit initializes the local state by moving away from the current global state in the direction opposite to the latest local state. This relaxed initialization helps to correct the local divergence and enhance the level of local consistency. Moreover, to further understand how inconsistency degrades performance in FL, we introduce an excess risk analysis and study the divergence term to investigate the test error of the proposed FedInit method. Our studies show that the optimization error is not sensitive to this local inconsistency, which instead mainly affects the generalization error bound of FedInit. Extensive experiments are conducted to validate this conclusion. Our proposed FedInit achieves state-of-the-art (SOTA) results compared to several advanced benchmarks without any additional cost. Meanwhile, the stage-wise relaxed initialization can also be incorporated into current advanced algorithms to achieve higher performance in the FL paradigm.
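The relaxed initialization described in the abstract can be illustrated with a short sketch. The relaxation coefficient `beta`, the helper names, and the FedAvg-style aggregation below are illustrative assumptions rather than the paper's exact notation or implementation; with `beta = 0` the scheme reduces to the usual initialization from the global model.

```python
# Minimal sketch of stage-wise relaxed initialization (FedInit-style).
# Names (`beta`, `relaxed_init`, `local_train`) are assumptions for
# illustration, not the authors' exact formulation.
import numpy as np

def relaxed_init(global_state: np.ndarray,
                 prev_local_state: np.ndarray,
                 beta: float = 0.1) -> np.ndarray:
    """Initialize a client's local model for the next training stage.

    The client starts from the global state, shifted away from its own
    latest local state (i.e., along global - prev_local), which is meant
    to counteract the drift accumulated in the previous stage.
    """
    return global_state + beta * (global_state - prev_local_state)

def federated_round(global_state, client_datasets, prev_local_states,
                    local_train, beta=0.1):
    """One communication round with relaxed initialization (sketch)."""
    new_locals = []
    for data, prev_local in zip(client_datasets, prev_local_states):
        init = relaxed_init(global_state, prev_local, beta)
        new_locals.append(local_train(init, data))  # K local SGD steps
    # Simple FedAvg-style aggregation of the updated local models.
    return np.mean(new_locals, axis=0), new_locals
```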

Authors (3)
  1. Yan Sun (309 papers)
  2. Li Shen (363 papers)
  3. Dacheng Tao (829 papers)
Citations (13)
