
Adaptive Self-Distillation for Minimizing Client Drift in Heterogeneous Federated Learning (2305.19600v3)

Published 31 May 2023 in cs.LG

Abstract: Federated Learning (FL) is a machine learning paradigm that enables clients to jointly train a global model by aggregating the locally trained models without sharing any local training data. In practice, there can often be substantial heterogeneity (e.g., class imbalance) across the local data distributions observed by each of these clients. Under such non-iid data distributions across clients, FL suffers from the 'client-drift' problem where every client drifts to its own local optimum. This results in slower convergence and poor performance of the aggregated model. To address this limitation, we propose a novel regularization technique based on adaptive self-distillation (ASD) for training models on the client side. Our regularization scheme adaptively adjusts to the client's training data based on the global model entropy and the client's label distribution. The proposed regularization can be easily integrated atop existing, state-of-the-art FL algorithms, leading to a further boost in the performance of these off-the-shelf methods. We theoretically explain how ASD reduces client-drift and also explain its generalization ability. We demonstrate the efficacy of our approach through extensive experiments on multiple real-world benchmarks and show substantial gains in performance over state-of-the-art methods.
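As a rough illustration of the kind of regularizer the abstract describes, the sketch below adds a per-sample weighted distillation term to a client's cross-entropy loss. The abstract only states that the weighting depends on the global model's entropy and the client's label distribution; the specific weighting used here (exponentiated negative entropy divided by the label's local frequency, normalized over the batch), the function name `adaptive_distillation_loss`, and the parameter `lam` are assumptions made for illustration, not the paper's confirmed formulation.

```python
import numpy as np

def softmax(logits):
    # Numerically stable row-wise softmax.
    z = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def adaptive_distillation_loss(local_logits, global_logits, labels,
                               label_prior, lam=1.0, eps=1e-12):
    """Cross-entropy plus a per-sample weighted KL(global || local) term.

    ASSUMPTION: the weight of each sample is exp(-entropy of the global
    prediction) / (local frequency of the sample's label), normalized over
    the batch. This is one plausible reading of the abstract, nothing more.
    """
    p_local = softmax(local_logits)
    p_global = softmax(global_logits)
    n = len(labels)

    # Standard supervised loss on the client's own labels.
    ce = -np.log(p_local[np.arange(n), labels] + eps).mean()

    # Entropy of the global model's prediction for each sample.
    entropy = -(p_global * np.log(p_global + eps)).sum(axis=1)

    # Per-sample divergence between global (teacher) and local (student).
    kl = (p_global * (np.log(p_global + eps) - np.log(p_local + eps))).sum(axis=1)

    # Trust confident global predictions more, and up-weight labels that
    # are rare on this client (hypothetical weighting, see docstring).
    raw = np.exp(-entropy) / (label_prior[labels] + eps)
    weights = raw / raw.sum()

    return ce + lam * (weights * kl).sum(), weights

# Toy batch: 2 samples, 3 classes; class 0 is rare on this client.
local = np.array([[2.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
global_ = np.array([[5.0, 0.0, 0.0], [0.1, 0.0, 0.0]])
labels = np.array([0, 1])
prior = np.array([0.1, 0.8, 0.1])
loss, weights = adaptive_distillation_loss(local, global_, labels, prior)
```

In this toy batch the first sample has a confident global prediction and a locally rare label, so it receives the larger weight. Setting `lam=0.0` recovers plain cross-entropy, which is how such a term can sit on top of an existing client objective.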

Authors (5)
  1. M. Yashwanth (1 paper)
  2. Gaurav Kumar Nayak (20 papers)
  3. Arya Singh (1 paper)
  4. Yogesh Simmhan (59 papers)
  5. Anirban Chakraborty (52 papers)