Decentralized Federated Learning

Updated 28 December 2025
  • Decentralized Federated Learning is a distributed ML paradigm in which clients collaboratively train models on private data without a central server, enhancing privacy and resilience.
  • It leverages diverse network topologies, such as mesh, ring, and hierarchical, to optimize convergence rates and reduce communication overhead.
  • Advanced protocols like decentralized SGD, consensus ADMM, and robust aggregation ensure scalability, security, and effective handling of data heterogeneity.

Decentralized Federated Learning (DFL) is a distributed machine learning paradigm in which clients collaboratively train models using private data, exchanging knowledge directly in a peer-to-peer fashion or via distributed ledgers, thus eliminating the need for a centralized server. This design mitigates single-point-of-failure risks, enhances system resilience, and strengthens privacy guarantees by obviating the central orchestrator present in classical federated learning. DFL encompasses diverse architectural, algorithmic, and security innovations, and is increasingly employed across edge computing, industry, healthcare, and other privacy-sensitive domains (Yuan et al., 2023, Gabrielli et al., 2023, Hallaji et al., 25 Jan 2024).

1. Fundamental Models and Topologies

DFL formalizes collaborative optimization as a consensus problem over a graph. Each of N clients holds local data and maintains a model w_i. The global objective is typically

\min_{w_1,\dots,w_N} \; \sum_{i=1}^N p_i F_i(w_i) \quad \text{s.t.} \quad w_i = w_j \;\; \forall (i,j) \in E,

where p_i are aggregation weights, F_i is the local loss, and G = (V, E) is the physical or logical network topology (Yuan et al., 2023, Gabrielli et al., 2023).

Topological structures include:

  • Fully connected (mesh): Rapid consensus but high per-round communication.
  • Ring, line, star: Lower degree, tradeoff in convergence rate vs. efficiency.
  • Dynamic/random graphs: Adapt to node churn, mobility, or bandwidth constraints.
  • Hierarchical/multi-level: Aggregators coordinate subgroups, combining DFL with clustered FL (Yuan et al., 2023).

Topological choice governs the convergence rate, fault tolerance, and communication complexity, with spectral properties of the mixing matrix W (derived from G) controlling error decay (O(σ^k), where σ is the second-largest eigenvalue magnitude of W) (Yuan et al., 2023).
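
As a concrete illustration of this dependence, the minimal sketch below (our own illustrative code; the helper name metropolis_weights and the ring example are assumptions, not from the cited works) builds a symmetric, doubly stochastic mixing matrix for a small ring topology via Metropolis-Hastings weights and reports the second-largest eigenvalue magnitude σ governing the O(σ^k) decay.

```python
import numpy as np

def metropolis_weights(adjacency: np.ndarray) -> np.ndarray:
    """Symmetric, doubly stochastic mixing matrix from an undirected
    adjacency matrix using Metropolis-Hastings weights."""
    n = adjacency.shape[0]
    degrees = adjacency.sum(axis=1)
    W = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            if i != j and adjacency[i, j]:
                W[i, j] = 1.0 / (1.0 + max(degrees[i], degrees[j]))
        W[i, i] = 1.0 - W[i].sum()  # self-weight absorbs the remaining mass
    return W

# Hypothetical example: a 6-node ring topology.
n = 6
A = np.zeros((n, n), dtype=int)
for i in range(n):
    A[i, (i + 1) % n] = A[(i + 1) % n, i] = 1

W = metropolis_weights(A)
sigma = np.sort(np.abs(np.linalg.eigvalsh(W)))[-2]  # second-largest |eigenvalue|
print(f"consensus error decays roughly as O({sigma:.3f}^k)")
```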

2. Core Algorithms and Protocols

Canonical DFL adopts local training on private data, followed by model aggregation across neighbors. Common procedures include:

  • Decentralized SGD (DSGD; a minimal code sketch follows this list):

w_i^{k+1/2} = \sum_{j} W_{ij} w_j^k, \qquad w_i^{k+1} = w_i^{k+1/2} - \eta \nabla F_i(w_i^k)

  • Consensus ADMM: Primal-dual optimization over connected graphs, supporting more general objectives and constraints (Yuan et al., 2023, Gabrielli et al., 2023).
  • Gossip Averaging: Each node exchanges and mixes parameters with a random neighbor.
  • Gradient Tracking: Nodes maintain surrogate gradient states to correct bias introduced by heterogeneity (y_i^k tracks ∑_j ∇F_j) (Gao et al., 2023).
  • Push-Sum on Directed Graphs: De-biased parameter sharing for asymmetric (column-stochastic) network structures (Li et al., 2023).
  • Event-triggered protocols: Clients transmit updates only upon significant local changes or resource-awareness, reducing communication (Zehtabi et al., 2022).
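
The following minimal sketch (our own illustrative code, with hypothetical quadratic local losses) implements one synchronous DSGD round as written above: each client first averages its neighbors' models using the mixing matrix W and then takes a local gradient step.

```python
import numpy as np

def dsgd_round(models, W, local_grad, lr=0.1):
    """One synchronous DSGD round over an (N, d) array of client models.
    `local_grad(i, w)` returns client i's stochastic gradient at w."""
    mixed = W @ models                               # w_i^{k+1/2} = sum_j W_ij w_j^k
    new_models = np.empty_like(models)
    for i in range(models.shape[0]):
        new_models[i] = mixed[i] - lr * local_grad(i, models[i])  # local gradient step
    return new_models

# Toy usage: clients hold quadratic losses F_i(w) = ||w - c_i||^2 with distinct targets c_i.
N, d = 4, 3
rng = np.random.default_rng(0)
targets = rng.normal(size=(N, d))
local_grad = lambda i, w: 2.0 * (w - targets[i])
W = np.full((N, N), 1.0 / N)                         # fully connected, uniform mixing
models = np.zeros((N, d))
for _ in range(200):
    models = dsgd_round(models, W, local_grad)
# Clients reach approximate consensus near the minimizer of the average loss
# (residual disagreement scales with the step size).
print(models.std(axis=0).max(), np.abs(models.mean(axis=0) - targets.mean(axis=0)).max())
```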

Aggregation rules (median, trimmed mean, Krum, FedProx, Zeno, Multi-Krum) are employed to mitigate Byzantine attacks and to generalize FedAvg beyond centralized settings (Beltrán et al., 2023).
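
A minimal sketch of two such rules applied at a single node (illustrative code, not from the cited survey): coordinate-wise median and trimmed mean suppress a poisoned neighbor update.

```python
import numpy as np

def coordinate_median(updates: np.ndarray) -> np.ndarray:
    """Byzantine-robust aggregation by coordinate-wise median over neighbor updates."""
    return np.median(updates, axis=0)

def trimmed_mean(updates: np.ndarray, trim: int) -> np.ndarray:
    """Coordinate-wise trimmed mean: drop the `trim` largest and smallest
    values in each coordinate before averaging."""
    sorted_updates = np.sort(updates, axis=0)
    return sorted_updates[trim:updates.shape[0] - trim].mean(axis=0)

# Example: one poisoned update among five is suppressed.
honest = np.ones((4, 3)) + 0.01 * np.random.default_rng(1).normal(size=(4, 3))
poisoned = 100.0 * np.ones((1, 3))
updates = np.vstack([honest, poisoned])
print(coordinate_median(updates))       # ~1.0 in every coordinate
print(trimmed_mean(updates, trim=1))    # ~1.0 in every coordinate
```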

3. Communication, Scalability, and Efficiency

DFL aims to suppress communication bottlenecks of centralized FL. Critical factors include:

  • Bandwidth utilization: Decentralized segmented gossip enables model parallelism across diverse links, ensuring node bandwidth saturation and linear speedups relative to centralized protocols (Hu et al., 2019).
  • Synchronization schemes: Synchronous rounds can incur straggler latency. Asynchronous and event-triggered protocols increase overall efficiency at the cost of more intricate convergence analysis (Zehtabi et al., 2022, Yuan et al., 2023); a schematic transmission rule follows this list.
  • Communication cost: Peer-to-peer and push-sum schemes reduce effective per-round data transfer to O(log K) per client for proxy models, and O(degree) for dense models, compared with O(K) in server-centric FL (Kalra et al., 2021, Hu et al., 2019, Gabrielli et al., 2023).
  • Overhead: Blockchain-based approaches add transaction and consensus costs, which may become significant at scale but confer auditability and incentive compatibility (Zhang et al., 2023, Ghanem et al., 2021, ChaoQun, 2022, S et al., 26 Apr 2025).
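
To make the event-triggered idea concrete, here is a schematic transmission rule (a sketch under assumed threshold semantics; broadcast_to_neighbors is a hypothetical placeholder, not an API from the cited works).

```python
import numpy as np

def should_transmit(current: np.ndarray, last_sent: np.ndarray, threshold: float) -> bool:
    """Event-triggered rule: broadcast only when the local model has drifted
    sufficiently far from the last transmitted copy (hypothetical threshold semantics)."""
    return np.linalg.norm(current - last_sent) > threshold

# Usage inside a client's training loop (sketch):
# if should_transmit(w_local, w_last_sent, threshold=0.05):
#     broadcast_to_neighbors(w_local)   # hypothetical networking call
#     w_last_sent = w_local.copy()
```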

4. Security, Trust, and Privacy Mechanisms

Decentralization both mitigates and introduces new attack surfaces.

Attack models include Byzantine (malicious) participants, honest-but-curious adversaries, and consensus attacks on blockchain, with analytical bounds on subversion and defense success probabilities (Hallaji et al., 25 Jan 2024).
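
Among the defenses listed in Section 2, Krum illustrates selection-based robustness against Byzantine participants: each update is scored by its squared distance to its closest peers and the lowest-scoring update is kept. The sketch below is our own minimal illustration, assuming at most f Byzantine neighbors and n > f + 2.

```python
import numpy as np

def krum(updates: np.ndarray, f: int) -> np.ndarray:
    """Krum selection (sketch): keep the update whose summed squared distance to its
    n - f - 2 nearest peers is smallest; assumes n > f + 2 and at most f Byzantine peers."""
    n = updates.shape[0]
    n_closest = n - f - 2
    scores = []
    for i in range(n):
        dists = np.sum((updates - updates[i]) ** 2, axis=1)
        dists[i] = np.inf                       # ignore self-distance
        scores.append(np.sort(dists)[:n_closest].sum())
    return updates[int(np.argmin(scores))]

# Example: one outlier among five neighbor updates is never selected.
updates = np.vstack([np.ones((4, 3)), 50.0 * np.ones((1, 3))])
print(krum(updates, f=1))                       # returns an all-ones (honest) update
```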

5. Model and Data Heterogeneity

DFL frameworks accommodate heterogeneity in both model architectures and data distributions:

  • Mutual learning/distillation: Clients exchange predicted class probabilities or logits rather than raw parameters, supporting arbitrary architectures and robust adaptation under severe non-IID splits (Khalil et al., 2 Feb 2024, Wittkopp et al., 2021); a minimal sketch follows this list.
  • Proxy/teacher-student protocols: Models share synthetic representations, not parameters, yielding strong privacy and fast cold-start adaptation (Wittkopp et al., 2021, Kalra et al., 2021).
  • Personalized aggregation: Clients select neighbors dynamically or tailor aggregation weights via game-theoretic or scoring approaches, optimizing individual prediction performance (Behera et al., 5 Oct 2024).
  • Zero-shot decentralized FL: Prompt sharing between clients enables adaptation of large vision-language models with dramatically reduced communication (∼118× improvement) and competitive generalization compared to centralized prompt aggregation (Masano et al., 30 Sep 2025).
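
A minimal sketch of the logit-exchange step in mutual distillation (first bullet above), assuming neighbors score a shared public or synthetic reference batch so that only softened predictions cross the network; the function names and temperature are illustrative, not taken from the cited papers.

```python
import numpy as np

def softened_probs(logits: np.ndarray, T: float = 2.0) -> np.ndarray:
    """Temperature-softened softmax over a batch of logits (rows = examples)."""
    z = logits / T
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def distillation_loss(local_logits: np.ndarray, peer_logits: np.ndarray, T: float = 2.0) -> float:
    """KL(peer || local) on a shared reference batch; architectures may differ,
    since only logits are exchanged."""
    p = softened_probs(peer_logits, T)
    q = softened_probs(local_logits, T)
    return float(np.mean(np.sum(p * (np.log(p + 1e-12) - np.log(q + 1e-12)), axis=1)))

# Each client would add the average of `distillation_loss` over its neighbors
# to its local training objective, exchanging logits rather than parameters.
```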

6. Theoretical Analysis and Empirical Results

Theoretical guarantees in DFL depend on network structure, update algorithms, and loss-function convexity.
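
For the consensus component, the spectral decay noted in Section 1 can be stated schematically (a sketch, assuming a symmetric, doubly stochastic mixing matrix W with second-largest eigenvalue magnitude σ < 1):

\left\| W^k x - \bar{x}\,\mathbf{1} \right\| \le \sigma^k \left\| x - \bar{x}\,\mathbf{1} \right\|, \qquad \bar{x} = \frac{1}{N}\sum_{i=1}^N x_i,

so disagreement between clients contracts geometrically at rate σ per mixing step; local gradient updates and data heterogeneity add further terms whose precise form depends on the algorithm and convexity assumptions.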

7. Open Challenges and Research Directions

DFL presents several unresolved issues and directions for future work, spanning the communication-efficiency, security, privacy, and heterogeneity dimensions surveyed above.

DFL represents a highly technical, rapidly evolving research frontier with a rich interplay between distributed optimization, security, privacy, and large-scale deployment considerations. The surveyed works outline the concrete progress in decentralized architectures, robust aggregation, privacy-enhancement, blockchain integration, and flexible, heterogeneous model learning (Yuan et al., 2023, Zhang et al., 2023, Gabrielli et al., 2023, Hallaji et al., 25 Jan 2024, Yuan et al., 2021, ChaoQun, 2022, Hu et al., 2019, Masano et al., 30 Sep 2025).
