Communication-Efficient On-Device Machine Learning: Federated Distillation and Augmentation under Non-IID Private Data (1811.11479v2)

Published 28 Nov 2018 in cs.LG, cs.NI, and stat.ML

Abstract: On-device ML enables the training process to exploit a massive amount of user-generated private data samples. To enjoy this benefit, inter-device communication overhead should be minimized. To this end, we propose federated distillation (FD), a distributed model training algorithm whose communication payload size is much smaller than a benchmark scheme, federated learning (FL), particularly when the model size is large. Moreover, user-generated data samples are likely to become non-IID across devices, which commonly degrades the performance compared to the case with an IID dataset. To cope with this, we propose federated augmentation (FAug), where each device collectively trains a generative model, and thereby augments its local data towards yielding an IID dataset. Empirical studies demonstrate that FD with FAug yields around 26x less communication overhead while achieving 95-98% test accuracy compared to FL.

Authors (6)
  1. Eunjeong Jeong (8 papers)
  2. Seungeun Oh (11 papers)
  3. Hyesung Kim (12 papers)
  4. Jihong Park (123 papers)
  5. Mehdi Bennis (333 papers)
  6. Seong-Lyun Kim (81 papers)
Citations (535)

Summary

Communication-Efficient On-Device Machine Learning: Federated Distillation and Augmentation Under Non-IID Private Data

The paper "Communication-Efficient On-Device Machine Learning: Federated Distillation and Augmentation Under Non-IID Private Data" addresses a significant challenge in the domain of federated learning (FL) regarding communication overhead and non-IID data distributions across devices. The core contributions are the introduction of Federated Distillation (FD) and Federated Augmentation (FAug), which aim to enhance communication efficiency while maintaining high model accuracy.

Key Innovations

Federated Distillation (FD):

FD is proposed as a communication-efficient alternative to conventional FL. Instead of transmitting full model parameters, each device exchanges small per-label logit vectors. Following an online knowledge distillation approach, every device periodically uploads its locally averaged logits and regularizes its training against the logits averaged over the other devices. Because the payload scales with the number of labels rather than with the model size, devices can train larger local models without the prohibitive communication cost of full model exchanges.
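The sketch below illustrates this kind of local update, assuming the server has already broadcast per-label logit vectors averaged over the other devices (`global_avg_logits`). The weight `gamma` and the mean-squared-error distillation term are illustrative choices, not necessarily the paper's exact regularizer.

```python
# Minimal sketch of a Federated Distillation (FD) local update (assumptions noted above).
import torch
import torch.nn.functional as F

def fd_local_step(model, optimizer, x, y, global_avg_logits, gamma=0.1):
    """One local step: cross-entropy plus a distillation term that pulls the
    device's logits toward the globally averaged logits of the true label."""
    optimizer.zero_grad()
    logits = model(x)                            # [batch, num_classes]
    ce_loss = F.cross_entropy(logits, y)         # ordinary supervised loss
    teacher = global_avg_logits[y]               # per-label global average logits
    distill_loss = F.mse_loss(logits, teacher)   # assumed distillation distance
    loss = ce_loss + gamma * distill_loss
    loss.backward()
    optimizer.step()
    return loss.item()
```

Note that only `global_avg_logits` (a `num_classes x num_classes` table) crosses the network, which is what keeps the payload independent of the model size.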

Federated Augmentation (FAug):

To mitigate the performance degradation caused by non-IID data distributions, FAug uses a generative adversarial network (GAN) to augment each device's local dataset so that it becomes more IID-like. Each device uploads a few seed samples of its under-represented labels; the server oversamples these and trains a conditional GAN, whose generator the devices then download to locally synthesize the samples they lack. This balances the local datasets while exchanging only a handful of samples and the generator, limiting both privacy leakage and communication cost.
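A minimal device-side sketch of this augmentation step is shown below, assuming the server has already returned a trained conditional generator `G(z, label)` that produces samples with the same shape as the local data. The generator interface, `latent_dim`, and the balance-to-the-largest-class rule are illustrative assumptions.

```python
# Device-side Federated Augmentation (FAug) sketch, under the assumptions above.
import torch

def augment_to_iid(G, local_x, local_y, num_classes, latent_dim=100):
    """Generate synthetic samples for under-represented labels so the local
    label distribution becomes (approximately) uniform."""
    x = torch.as_tensor(local_x, dtype=torch.float32)
    y = torch.as_tensor(local_y, dtype=torch.long)
    counts = torch.bincount(y, minlength=num_classes)
    target = counts.max()                         # balance toward the largest class
    gen_x, gen_y = [x], [y]
    for label in range(num_classes):
        deficit = int(target - counts[label])
        if deficit <= 0:
            continue
        z = torch.randn(deficit, latent_dim)      # latent noise
        labels = torch.full((deficit,), label, dtype=torch.long)
        with torch.no_grad():
            gen_x.append(G(z, labels))            # conditional generation
        gen_y.append(labels)
    return torch.cat(gen_x), torch.cat(gen_y)
```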

Empirical Evaluation

The experimental evaluation reports gains in both communication overhead and accuracy. On a non-IID MNIST setup, FD combined with FAug reduces communication cost by roughly 26x compared to traditional FL while reaching 95-98% of FL's test accuracy. The paper also highlights that FD and FAug remain robust to non-IID data, losing far less accuracy than standalone on-device training.
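To make the payload gap concrete, a rough back-of-the-envelope comparison (the numbers below are assumptions for illustration, not figures from the paper) contrasts what FL and FD upload per device per round:

```python
# Illustrative payload comparison: FL uploads full model parameters,
# FD uploads a per-label table of average logit vectors.
NUM_PARAMS = 1_000_000     # assumed local model size
NUM_CLASSES = 10           # e.g., MNIST
BYTES_PER_FLOAT = 4        # float32

fl_payload = NUM_PARAMS * BYTES_PER_FLOAT                 # ~4 MB per round
fd_payload = NUM_CLASSES * NUM_CLASSES * BYTES_PER_FLOAT  # ~400 bytes per round

print(f"FL upload per device per round: {fl_payload / 1e6:.1f} MB")
print(f"FD upload per device per round: {fd_payload} bytes")
```

The paper's reported ~26x figure accounts for the full protocol, including FAug's seed-sample upload and generator download, so it is smaller than this idealized per-round gap.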

Implications and Future Directions

The proposed FD and FAug strategies have important implications for on-device ML, particularly in scenarios where bandwidth is constrained or privacy concerns preclude the sharing of raw data. The methods ensure that model training remains efficient and scalable while respecting data privacy constraints.

Further research could explore hybrid approaches that balance the trade-offs between FL and FD, potentially leveraging FL for downlink communications where links are typically more robust. Additionally, integrating differential privacy mechanisms into FAug could further enhance data privacy without compromising model accuracy.

In summary, by focusing on efficient communication and non-IID data handling, this paper makes meaningful strides toward more practical and adaptable distributed ML frameworks. This enhances the feasibility of deploying advanced ML models across various devices with limited communication capabilities.