Node Selection Toward Faster Convergence for Federated Learning on Non-IID Data (2105.07066v3)

Published 14 May 2021 in cs.LG and cs.AI

Abstract: Federated Learning (FL) is a distributed learning paradigm that enables a large number of resource-limited nodes to collaboratively train a model without data sharing. The non-independent-and-identically-distributed (non-i.i.d.) data samples invoke discrepancies between the global and local objectives, making the FL model slow to converge. In this paper, we proposed Optimal Aggregation algorithm for better aggregation, which finds out the optimal subset of local updates of participating nodes in each global round, by identifying and excluding the adverse local updates via checking the relationship between the local gradient and the global gradient. Then, we proposed a Probabilistic Node Selection framework (FedPNS) to dynamically change the probability for each node to be selected based on the output of Optimal Aggregation. FedPNS can preferentially select nodes that propel faster model convergence. The unbiasedness of the proposed FedPNS design is illustrated and the convergence rate improvement of FedPNS over the commonly adopted Federated Averaging (FedAvg) algorithm is analyzed theoretically. Experimental results demonstrate the effectiveness of FedPNS in accelerating the FL convergence rate, as compared to FedAvg with random node selection.

Citations (118)

Summary

  • The paper presents a novel FedPNS framework that uses an Optimal Aggregation algorithm to exclude adverse local updates, thereby accelerating convergence in Federated Learning.
  • It employs a Probabilistic Node Selection approach to dynamically adjust participation based on gradient alignment with the global model.
  • Theoretical and empirical analyses demonstrate that FedPNS outperforms conventional FedAvg in accuracy and training speed on heterogeneous datasets.

Node Selection Toward Faster Convergence for Federated Learning on Non-IID Data

This paper focuses on improving the convergence rate of Federated Learning (FL) on non-independent and identically distributed (non-i.i.d.) data. The authors propose a novel approach, consisting of an Optimal Aggregation algorithm and a Probabilistic Node Selection framework (FedPNS), to address the challenges posed by data heterogeneity in FL.

Overview of Federated Learning and Challenges

Federated Learning is a decentralized machine learning approach that enables multiple nodes to collaboratively train a model without sharing their local data. However, the discrepancy between local and global objectives caused by non-i.i.d. data slows down convergence: each node collects data that may follow a different distribution, which complicates global model training and requires additional communication rounds.
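To ground the baseline that FedPNS improves upon, here is a minimal sketch of a single FedAvg-style round. It assumes each node is a dict with an "n_samples" field and that a hypothetical `local_update` callable runs local SGD and returns updated weights as a NumPy array; these names are illustrative, not from the paper.

```python
import numpy as np

def fedavg_round(global_w, nodes, num_select, local_update, rng):
    """One FedAvg-style round: pick nodes uniformly at random, train
    locally on each, then average the returned weights in proportion
    to each node's local data size."""
    picked = rng.choice(len(nodes), size=num_select, replace=False)
    updates, sizes = [], []
    for i in picked:
        # local_update is a hypothetical helper: a few epochs of SGD
        # on node i's private data, starting from the global weights.
        updates.append(local_update(global_w, nodes[i]))
        sizes.append(nodes[i]["n_samples"])
    coeffs = np.asarray(sizes, dtype=float)
    coeffs /= coeffs.sum()
    return sum(c * u for c, u in zip(coeffs, updates))
```

Under non-i.i.d. data, these uniformly sampled updates can pull the average away from the global objective, which is exactly the failure mode FedPNS targets.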

Proposed Solution: FedPNS

Optimal Aggregation Algorithm

The authors propose an Optimal Aggregation algorithm that enhances the aggregation step in FL. The algorithm identifies and excludes adverse local updates based on the relationship between each local gradient and the global gradient. Specifically, it evaluates the inner product of these gradients and excludes nodes whose updates would negatively impact the global model. This selective aggregation increases the expected decrease in FL loss per round, facilitating faster convergence.
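As a rough illustration of the inner-product test, the following greedy sketch drops a node when its (flattened) gradient points against the aggregate of the remaining gradients. The paper's actual Optimal Aggregation procedure (its search order and stopping rule) may differ, so treat this as an assumption-laden approximation rather than the authors' algorithm.

```python
import numpy as np

def optimal_aggregation_sketch(local_grads):
    """Greedy filter: repeatedly drop any node whose local gradient has
    a negative inner product with the sum of the other kept gradients,
    i.e. whose update opposes the approximate global gradient.
    Returns the indices of the kept nodes."""
    kept = list(range(len(local_grads)))
    changed = True
    while changed and len(kept) > 1:
        changed = False
        for i in list(kept):
            rest = sum(local_grads[j] for j in kept if j != i)
            if float(np.dot(local_grads[i], rest)) < 0.0:  # adverse update
                kept.remove(i)
                changed = True
                break  # re-evaluate against the shrunken set
    return kept
```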

Probabilistic Node Selection Framework

FedPNS dynamically adjusts each node's selection probability based on its contribution to the global model, as determined by the Optimal Aggregation results. The framework preferentially selects nodes whose updates are more aligned with the global objective, improving convergence speed over the uniform random selection used in FedAvg.
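One plausible reading of this feedback loop, in code: nodes excluded by Optimal Aggregation have their selection probability decayed, and the distribution is renormalized before the next round. The decay factor and the exact update rule below are illustrative assumptions, not the paper's formula; in particular, the paper also shows how to keep the resulting aggregate unbiased, which this sketch omits.

```python
import numpy as np

def update_selection_probs(probs, excluded, decay=0.9):
    """Hypothetical update: shrink the probability of nodes whose
    updates were excluded as adverse, then renormalize to sum to 1.
    (decay=0.9 is an illustrative constant, not from the paper.)"""
    probs = probs.copy()
    probs[excluded] *= decay
    return probs / probs.sum()

def select_nodes(probs, num_select, rng):
    """Sample node indices without replacement, weighted by probs."""
    return rng.choice(len(probs), size=num_select, replace=False, p=probs)
```

For example, starting from a uniform `probs = np.full(10, 0.1)`, a call to `update_selection_probs(probs, excluded=[3, 7])` shifts probability mass toward the nodes whose updates were kept, so subsequent `select_nodes` calls favor them.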

Theoretical and Empirical Analyses

The paper provides a theoretical analysis demonstrating the convergence-rate improvement of FedPNS over the standard FedAvg algorithm. By reducing weight divergence and accounting for data heterogeneity in node selection, FedPNS attains tighter convergence bounds. Empirical evaluations are conducted on a synthetic dataset and on real datasets (MNIST and CIFAR-10) using multinomial logistic regression (MLR) and convolutional neural network (CNN) models. The experiments show the effectiveness of FedPNS, particularly under high data heterogeneity, where it achieves higher model accuracy and faster convergence.

Implications and Future Prospects

The proposed FedPNS framework has significant implications for practical FL deployments, where non-i.i.d. data is prevalent: improved convergence translates directly into reduced communication overhead and shorter training times. Incorporating FedPNS into systems that require consensus learning across heterogeneous nodes could further strengthen privacy-preserving machine learning. Future work could build on this framework by exploring adaptive learning rates or additional node-selection criteria derived from real-time data analytics.

In conclusion, the paper provides a comprehensive solution to the non-i.i.d. data challenge in FL, offering theoretical insights and practical implementations to improve learning efficiency and system throughput in decentralized networks.
