Overview of HeteroFL: Efficient Federated Learning for Heterogeneous Clients
This paper introduces HeteroFL, a novel framework for federated learning (FL) designed to address the challenges posed by heterogeneous clients with varying computational and communication capabilities. The authors propose an approach that allows for the training of local models with different computational complexities while still generating a unified global inference model. This framework breaks from the conventional assumption that local models must mirror the global model's architecture.
Key Contributions
The paper highlights several key contributions:
- Model Heterogeneity: HeteroFL enables the training of local models with variable complexity and aggregates them into a single coherent global inference model. This adaptability lets each client contribute according to its own capability without imposing extra computational overhead.
- Dynamic Adaptation: The framework supports dynamic adaptation to clients' capabilities, ensuring stable and effective learning outcomes even as model heterogeneity evolves. This flexibility addresses practical deployment scenarios where client resources may fluctuate.
- Enhanced FL Training Strategies: The authors introduce training strategies that improve the robustness of FL, notably under balanced non-IID data partitions. Taken together, these strategies let HeteroFL match state-of-the-art results in fewer communication rounds.
Methodological Details
Subnetwork Distribution
The authors develop a technique for selecting subsets of the global model's parameters to assign to clients with differing capabilities. This is done by shrinking the width of hidden layers, since narrower layers reduce both parameter count and per-example computation. The approach involves:
- Defining multiple levels of computational complexity, allowing clients to operate on adaptively sized parameter subsets.
- Aggregating local models into a global model by averaging each parameter over the clients whose subset contains it, which keeps training stable across diverse settings (a minimal sketch of both steps follows this list).
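The sketch below illustrates the two steps under simplifying assumptions: it is not the authors' implementation, the function names `extract_subnetwork` and `aggregate` are invented for illustration, weights are represented as a plain dict of NumPy matrices, and details such as keeping input/output layers at full size and rescaling activations are omitted.

```python
import numpy as np

def extract_subnetwork(global_weights, rate):
    # Take the top-left sub-matrix of every weight matrix, shrinking both the
    # output and input width by `rate` (e.g. 1.0, 0.5, 0.25). The actual method
    # keeps layers tied to input features and class labels at full size; this
    # sketch ignores that detail for brevity.
    local = {}
    for name, w in global_weights.items():
        rows = max(1, int(np.ceil(w.shape[0] * rate)))
        cols = max(1, int(np.ceil(w.shape[1] * rate)))
        local[name] = w[:rows, :cols].copy()
    return local

def aggregate(global_weights, local_updates):
    # Each global entry is replaced by the average of the local updates that
    # cover it; entries trained by no client keep their previous value.
    new_global = {name: w.copy() for name, w in global_weights.items()}
    for name, w in new_global.items():
        total = np.zeros_like(w)
        count = np.zeros_like(w)
        for local in local_updates:
            lw = local[name]
            total[:lw.shape[0], :lw.shape[1]] += lw
            count[:lw.shape[0], :lw.shape[1]] += 1
        covered = count > 0
        w[covered] = total[covered] / count[covered]
    return new_global

# Toy round: three clients at full, half, and quarter width.
rng = np.random.default_rng(0)
global_weights = {"fc1": rng.normal(size=(8, 4)), "fc2": rng.normal(size=(4, 8))}
client_models = [extract_subnetwork(global_weights, r) for r in (1.0, 0.5, 0.25)]
# ...each client would run local training on its slice here...
global_weights = aggregate(global_weights, client_models)
```

Because the smaller models are nested inside the larger ones, every client's update lands in a well-defined region of the global parameter tensors, which is what makes the per-entry averaging above possible.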
Static Batch Normalization (sBN)
To maintain privacy and efficiency, HeteroFL replaces standard batch normalization with a static variant, sBN, which does not track running statistics during training and instead normalizes each batch with its own statistics. Dropping the running estimates removes the need to communicate them every round and avoids exposing summary statistics of local data, while retaining the benefits of normalization.
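A minimal PyTorch sketch of this idea, assuming sBN can be approximated by disabling running-statistics tracking; the class name `StaticBatchNorm2d` is illustrative, and how inference-time statistics are finally obtained is not covered here.

```python
import torch
import torch.nn as nn

class StaticBatchNorm2d(nn.BatchNorm2d):
    # With track_running_stats=False, PyTorch normalizes with the statistics
    # of the current batch in both train and eval mode, so there are no
    # running buffers to synchronize between server and clients.
    def __init__(self, num_features):
        super().__init__(num_features, affine=True, track_running_stats=False)

x = torch.randn(16, 64, 8, 8)   # (batch, channels, height, width)
bn = StaticBatchNorm2d(64)
y = bn(x)                       # normalized with this batch's mean/variance only
```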
Scalability and Evaluation
Extensive empirical evaluation on MNIST, CIFAR10, and WikiText2 demonstrates the framework's computational savings and accuracy across architectures, spanning convolutional networks for image classification and a Transformer for language modeling.
Practical Implications and Future Work
Practical Implications:
- HeteroFL offers a practical solution for deploying federated learning across diverse client environments, such as mobile devices and IoT deployments, where resource constraints vary significantly.
- The framework minimizes communication rounds and computational demands, which are critical factors in real-world federated learning applications.
Future Directions:
- Exploring heterogeneity at the level of distinct model classes or integrating few-shot learning and multi-modal learning within federated settings could further enhance HeteroFL's applicability.
- Addressing privacy concerns related to global batch normalization could open new avenues for secure federated learning implementations.
In conclusion, HeteroFL represents a significant advancement in federated learning, providing a flexible yet efficient approach to leverage heterogeneous client resources effectively. This work lays the groundwork for further exploration and optimization of model heterogeneity in federated systems.