Byzantine-Fault-Tolerant Federated Learning

Updated 23 November 2025

Byzantine-fault-tolerant federated learning is a method that mitigates arbitrary adversarial updates by robustly aggregating model parameters from distributed clients.
It leverages techniques like comparative elimination, layerwise cosine aggregation, and consensus-based optimization to maintain close approximation to the honest centroid.
The approach balances theoretical limits and practical trade-offs using validity constraints, weight manipulation, and hybrid defenses to ensure reliable performance under coordinated attacks.

Byzantine-fault-tolerant federated learning (BFT-FL) addresses distributed model training in adversarial environments where an unknown subset of clients, termed Byzantine, can send arbitrary (possibly coordinated) updates. Unlike benign failures (crashes, message loss), Byzantine faults admit arbitrary manipulations—model poisoning, gradient inversion, adversarial perturbations, or adaptive spoofing—and necessitate algorithmic defenses that provably prevent catastrophic degradation of global model quality or convergence.

1. Core Definitions and Threat Model

In BFT-FL, $n$ distributed clients participate in collaborative model training. At each round, each client $i$ communicates a $d$-dimensional vector $v_i \in \mathbb{R}^d$ (representing a gradient or model delta) to a central server or a peer-to-peer overlay. Among these, up to $t$ clients may be Byzantine, where the tolerable bound ($t < n/2$, $t < n/3$, or lower) depends on the aggregation protocol and validity condition. Byzantine clients can send vectors of their own choosing, with full knowledge of the system state and possibly in collusion with one another.

The canonical benchmark is model aggregation in federated averaging or stochastic gradient descent, where the goal is to compute an output $O_\mathcal{A}$ that closely approximates the centroid (mean) of the $n-t$ honest vectors,

$\bar v = \frac{1}{n-t} \sum_{i \in F} v_i$

for $F$ the (unknown) honest set, in the presence of arbitrary Byzantine $v_j$ .

Security in BFT-FL is thus formalized as bounding the deviation between $O_\mathcal{A}$ and $\bar v$ under various constraints. The approximation factor is most rigorously measured relative to the minimal covering radius of the set of plausible centroids: $\alpha(L) = \frac{\|O_\mathcal{A} - \bar v\|_2}{r_\mathcal{B}(L)}\,,$ where $r_\mathcal{B}(L)$ is the radius of the minimum covering ball over centroids that might arise under all possible $t$ -Byzantine subsets.
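As a minimal illustration of the quantities above, the following sketch computes the honest centroid $\bar v$ and the deviation $\|O_\mathcal{A} - \bar v\|_2$ when $O_\mathcal{A}$ is the plain, unprotected mean. The vectors and helper names are made up for the example, and the covering radius $r_\mathcal{B}(L)$ is not computed, so this shows only the numerator of $\alpha(L)$.

```python
import math

def centroid(vectors):
    """Coordinate-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def l2_distance(a, b):
    """Euclidean distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Illustrative inputs (assumed, not from any cited experiment):
# three honest vectors and one Byzantine vector.
honest = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
byzantine = [[100.0, -100.0]]

v_bar = centroid(honest)              # honest centroid, unknown to the server
naive = centroid(honest + byzantine)  # plain averaging over all n inputs
deviation = l2_distance(naive, v_bar) # numerator of alpha(L) for plain averaging

print(v_bar, naive, deviation)
```

A single unbounded Byzantine vector is enough to move the plain mean arbitrarily far from $\bar v$, which is why the aggregation rules discussed below replace it.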

2. Validity Constraints and Theoretical Limits

The robustness of aggregation in BFT-FL fundamentally hinges on which validity constraints are imposed:

Weak validity: $\mathcal{A}$ outputs $v$ if all $n$ inputs coincide at $v$ .
Strong validity: $\mathcal{A}$ outputs $v$ if all $n-t$ honest inputs coincide at $v$ .
Box validity: $O_\mathcal{A}$ must lie inside the smallest axis-parallel box containing all honest vectors.
Convex validity: $O_\mathcal{A}$ must be in the convex hull of the honest vectors.
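The box validity condition can be checked mechanically once the honest set is known, which is the case in analysis or simulation but not for a live server. The sketch below assumes that setting; `satisfies_box_validity` and its tolerance parameter are illustrative names, not part of any cited algorithm.

```python
def bounding_box(vectors):
    """Smallest axis-parallel box containing the vectors, as (lows, highs)."""
    lows = [min(col) for col in zip(*vectors)]
    highs = [max(col) for col in zip(*vectors)]
    return lows, highs

def satisfies_box_validity(output, honest_vectors, tol=1e-12):
    """True if every coordinate of `output` lies within the honest box."""
    lows, highs = bounding_box(honest_vectors)
    return all(lo - tol <= x <= hi + tol
               for x, lo, hi in zip(output, lows, highs))

# Illustrative honest vectors (assumed for the example).
honest = [[0.0, 0.0], [1.0, 2.0], [2.0, 1.0]]

inside = satisfies_box_validity([1.0, 1.0], honest)
outside = satisfies_box_validity([3.0, 1.0], honest)
corner = satisfies_box_validity([2.0, 2.0], honest)
```

The point `[2.0, 2.0]` passes the box check but lies outside the triangle spanned by the three honest vectors, so it would violate convex validity. This is the sense in which convex validity is the stricter condition.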

The main trade-offs are as follows (Cambus et al., 18 Jun 2025):

Validity class	Achievable $\alpha$	Lower bound on $\alpha$	Upper bound on $\alpha$	Redundancy requirement
Weak (trivial)	$\alpha = 1$	1	1	any $n > t$
Strong (MDA rule)	$\alpha = 2$ (tight)	2	2	$n > 2t$
Box	$O(\sqrt{\min\{n,d\}})$	$\Omega(\min\{(n-t)/t,\,\sqrt{d}\})$	$2\sqrt{\min\{n,d\}}$	$n > 3t$
Convex	$\Theta(\sqrt{d})$	$\sqrt{2d}$	$\sqrt{2d}$ (tight)	$n > (d+1)t$

Box and convex validity conditions restrict where the output may lie, which rules out pathological outputs at a cost in approximation quality. Under box validity, no algorithm can guarantee better than an $\Omega(\min\{(n-t)/t, \sqrt{d}\})$-approximation, due to scenarios in which Byzantine inputs collapse the axis-aligned box that the aggregator can trust while the honest centroid lies far outside that box in high-dimensional settings.
Under convex validity and $n>(d+1)t$ , a new polynomial-time algorithm achieves optimal $\sqrt{2d}$ -approximation by projecting onto the intersection of all convex hulls of $n-t$ subsets, tightly matching the lower bound (Cambus et al., 18 Jun 2025).

These limits are not simply theoretical; empirical experiments confirm that box-valid aggregation dominates in fault tolerance when data or model distributions are significantly heterogeneous among clients.
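The MDA rule named in the table for strong validity is taken here to be minimum diameter averaging: average the $n-t$ inputs that form the subset of smallest diameter. The brute-force sketch below enumerates all subsets, so it is exponential in $n$ and only meant for small examples; the function names are illustrative.

```python
import math
from itertools import combinations

def diameter(vectors):
    """Largest pairwise Euclidean distance (0.0 for fewer than two vectors)."""
    return max((math.dist(a, b) for a, b in combinations(vectors, 2)),
               default=0.0)

def minimum_diameter_averaging(vectors, t):
    """Average the (n - t)-subset with the smallest diameter (brute force)."""
    n = len(vectors)
    if n <= 2 * t:
        raise ValueError("MDA is used here under the assumption n > 2t")
    best = min(combinations(vectors, n - t), key=diameter)
    return [sum(col) / (n - t) for col in zip(*best)]

# Illustrative inputs: three clustered vectors and one far outlier.
spread = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [50.0, 50.0]]
out_spread = minimum_diameter_averaging(spread, t=1)

# Strong validity check: all honest inputs coincide at [2, 2].
agree = [[2.0, 2.0], [2.0, 2.0], [2.0, 2.0], [9.0, 9.0]]
out_agree = minimum_diameter_averaging(agree, t=1)
```

In the second call the three identical honest inputs form a subset of diameter zero, so the output coincides with them, as strong validity requires.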

3. Algorithmic Constructions for BFT-FL

3.1 Comparative Elimination and Trimmed Mean

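The comparative elimination (CE) filter is a canonical aggregation method that selects the $N-f$ updates closest to the current global state and averages them, effectively excising outliers (Dutta et al., 2023, Gupta et al., 2021). Here $N$ is the number of clients and $f$ the assumed number of Byzantine clients (written $n$ and $t$ in Section 1). For inputs $x^i_{k,T}$ (the update of client $i$ in round $k$), sorted by distance to the current global model $\bar x_k$, keep the $N-f$ closest, call their index set $\mathcal{F}_k$, and set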

$\bar x_{k+1} = \frac{1}{N-f} \sum_{i \in \mathcal{F}_k} x^i_{k,T}$

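This can loosely be viewed as a one-round heuristic in the spirit of box validity, and as a whole-vector analogue of the coordinate-wise trimmed mean, which applies the same discard-the-extremes idea to each coordinate separately.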
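A minimal sketch of the CE filter as described above, assuming Euclidean distance to the current global model is the ranking criterion. The function name and the toy inputs are illustrative.

```python
import math

def comparative_elimination(updates, current, f):
    """Average the N - f updates closest (in L2 distance) to `current`."""
    n = len(updates)
    if not 0 <= f < n:
        raise ValueError("need 0 <= f < number of updates")
    # Rank client updates by distance to the current global model.
    ranked = sorted(updates, key=lambda x: math.dist(x, current))
    kept = ranked[: n - f]  # drop the f farthest updates
    return [sum(col) / len(kept) for col in zip(*kept)]

# Illustrative round: global model at the origin, one far-away update.
current_model = [0.0, 0.0]
client_updates = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [10.0, 10.0]]
next_model = comparative_elimination(client_updates, current_model, f=1)
```

Note that the filter only removes updates that are far from the current model. A Byzantine update that stays close to $\bar x_k$ is retained, but its influence on the average is correspondingly limited.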

3.2 High-Dimensional and Layerwise Aggregation

Standard robust aggregators (Krum, Bulyan, geometric median) perform poorly in high-dimensional settings because of the curse of dimensionality: malicious "spikes" can evade detection via small perturbations. Layerwise cosine aggregation addresses this by partitioning the model into layers. Each layer's weights are median-clipped, normalized, and aggregated using a base robust rule with cosine distance in place of Euclidean distance (García-Márquez et al., 27 Mar 2025). This decouples resilience across layers, yielding a smaller effective approximation constant and improved accuracy under strong adversarial attacks and imbalanced parameter dimensions.
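The layerwise idea can be sketched as follows. This is a simplified illustration, not the cited authors' implementation: each client update is assumed to be a dict from layer name to a flat list of floats, clipping scales any layer vector down to the median norm for that layer, and the base rule is a Krum-style selection that scores each vector by its summed cosine distance to its $n-f-2$ nearest neighbours. All function names are hypothetical.

```python
import math
from statistics import median

def norm(v):
    return math.sqrt(sum(x * x for x in v))

def clip_to_median_norm(vectors):
    """Scale down any vector whose norm exceeds the median norm."""
    norms = [norm(v) for v in vectors]
    cap = median(norms)
    return [[x * cap / nv for x in v] if nv > cap else list(v)
            for v, nv in zip(vectors, norms)]

def cosine_distance(a, b):
    """1 - cosine similarity; defined as 1.0 if either vector is zero."""
    na, nb = norm(a), norm(b)
    if na == 0.0 or nb == 0.0:
        return 1.0
    return 1.0 - sum(x * y for x, y in zip(a, b)) / (na * nb)

def cosine_krum(vectors, f):
    """Krum-style choice: lowest summed cosine distance to nearest peers."""
    k = len(vectors) - f - 2
    if k < 1:
        raise ValueError("need at least f + 3 vectors")
    scores = []
    for i, v in enumerate(vectors):
        dists = sorted(cosine_distance(v, w)
                       for j, w in enumerate(vectors) if j != i)
        scores.append(sum(dists[:k]))
    return vectors[scores.index(min(scores))]

def layerwise_cosine_aggregate(client_updates, f):
    """Apply clipping and the cosine-based rule to each layer separately."""
    return {name: cosine_krum(
                clip_to_median_norm([u[name] for u in client_updates]), f)
            for name in client_updates[0]}

# Illustrative updates: four similar clients and one sign-flipped, scaled one.
updates = [
    {"w": [1.0, 0.0], "b": [1.0]},
    {"w": [0.9, 0.1], "b": [1.0]},
    {"w": [1.0, 0.1], "b": [1.0]},
    {"w": [0.95, 0.05], "b": [1.0]},
    {"w": [-50.0, 0.0], "b": [-9.0]},
]
aggregate = layerwise_cosine_aggregate(updates, f=1)
```

Because each layer is scored independently, a large layer cannot mask a manipulation in a small one, which is the decoupling the paragraph above refers to.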

3.3 Distributed Optimization Approaches

Rather than defensively aggregating, distributed optimization via consensus protocols (e.g., Primal-Dual Method of Multipliers, PDMM) introduces auxiliary consensus variables and two-round primal-dual updates that "pull" all client parameters towards each other simultaneously (Xia et al., 13 Mar 2025). These methods are reported to be inherently Byzantine-fault-tolerant: adversarial updates are corrected by the consensus mechanism, and no explicit outlier filtering is needed. Empirically, PDMM achieves substantial accuracy and stability improvements over FedAvg and matches or exceeds robust aggregation-based FL under attacks.

3.4 Asynchronous and Over-the-Air Aggregation

BFT-FL has been extended to asynchronous protocols. In Catalyst (Cox et al., 3 Jun 2024), the server updates after receiving $2f+1$ updates computed on the latest global model and uses clustering to excise malicious or outlier updates; stragglers are incorporated with a staleness factor. For wireless FL, communication-efficient schemes use over-the-air computation (AirComp), combining FL with geometric median aggregation (Weiszfeld algorithm) to achieve robust and bandwidth-efficient model fusion (Huang et al., 2021, Sifaou et al., 2022).
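The Weiszfeld iteration mentioned above computes the geometric median by repeated inverse-distance reweighting. The sketch below is a plain in-memory version with no over-the-air component; the iteration count, the `eps` guard against division by zero, and the sample points are assumptions made for illustration.

```python
import math

def geometric_median(points, iterations=100, eps=1e-9):
    """Approximate the geometric median with Weiszfeld's algorithm."""
    d = len(points[0])
    # Start from the coordinate-wise mean.
    estimate = [sum(col) / len(points) for col in zip(*points)]
    for _ in range(iterations):
        # Each point is weighted by the inverse of its distance
        # to the current estimate (guarded by eps).
        weights = [1.0 / max(math.dist(p, estimate), eps) for p in points]
        total = sum(weights)
        new = [sum(w * p[k] for w, p in zip(weights, points)) / total
               for k in range(d)]
        if math.dist(new, estimate) < eps:
            return new
        estimate = new
    return estimate

# Illustrative inputs: four unit-square corners and one extreme outlier.
points = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [1000.0, 1000.0]]
gm = geometric_median(points)
plain_mean = [sum(col) / len(points) for col in zip(*points)]
```

Here the plain mean is dragged to roughly (200, 200) by the outlier, while the geometric median stays inside the unit square.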

4. Practical Trade-offs and Experimental Benchmarks

Experimental studies benchmark BFT-FL algorithms under multiple Byzantine scenarios:

Attack models: sign-flipping, label flipping, model replacement, Gaussian and parameter-space perturbations, collusion, adaptive attacks, and backdoors.
Data regimes: homogeneous (IID), mildly heterogeneous and extremely non-IID.
Aggregation rules compared: covering-ball center, MDA, box-valid aggregation, multi-Krum, trimmed mean, coordinate-wise median, geometric median, layerwise-cosine strategies, consensus-based distributed optimization, holdout-SGD.
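Two of the coordinate-wise rules in the comparison list are short enough to sketch directly. The example assumes a simple scaled sign-flipping attacker; the vectors are invented for illustration and do not reproduce any cited benchmark.

```python
from statistics import median

def coordinate_wise_median(vectors):
    """Median of each coordinate taken independently."""
    return [median(col) for col in zip(*vectors)]

def trimmed_mean(vectors, f):
    """Per coordinate, drop the f smallest and f largest values, then average."""
    n = len(vectors)
    if 2 * f >= n:
        raise ValueError("need n > 2f so that some values remain")
    result = []
    for col in zip(*vectors):
        kept = sorted(col)[f: n - f]
        result.append(sum(kept) / len(kept))
    return result

# Three honest gradients near [1, -2]; one attacker sends a flipped,
# scaled gradient.
gradients = [[1.0, -2.0], [1.2, -1.8], [0.8, -2.2], [-10.0, 20.0]]

cw_median = coordinate_wise_median(gradients)
tm = trimmed_mean(gradients, f=1)
plain_mean = [sum(col) / len(gradients) for col in zip(*gradients)]
```

Both robust rules stay near the honest gradients, whereas the plain mean changes sign in both coordinates.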

Findings (Cambus et al., 18 Jun 2025, García-Márquez et al., 27 Mar 2025, Xia et al., 13 Mar 2025, Yue et al., 10 Sep 2024):

Box-valid aggregation outperforms strong and weak validity rules under data heterogeneity and model replacement attacks; it achieves $\approx 75\%$ accuracy at $f=2$ on highly non-IID data, whereas centroid-based or strong-validity methods quickly collapse.
Layerwise/cosine approaches achieve accuracy improvements of up to $16$ points over baseline robust aggregators in high-dimensional or deep network settings, recovering nearly all lost accuracy under coordinated attacks for complex models.
Consensus-based distributed optimization (PDMM) yields gains of $+37$ to $+56$ accuracy points over FedAvg under bit-flipping or Gaussian noise attacks and also reduces inter-run variance by up to $50\%$ (Xia et al., 13 Mar 2025).

5. Extensions, Open Problems, and Methodological Challenges

Despite progress, BFT-FL faces fundamental and practical challenges:

Weight manipulation and incentive compatibility: Standard FedAvg uses self-reported client weights (proportional to data volume) to improve statistical efficiency. Byzantine clients can manipulate their weights to subvert any aggregation rule. Weight-truncation preprocessing—capping each client's weight so that no coalition can control more than the fault-tolerant share—restores the information-theoretic guarantees of any underlying robust rule (Portnoy et al., 2020, Shi et al., 2021).
Majority attacker and data heterogeneity: Some schemes (e.g., Robust-FL) show that combining trend-prediction of the global model via exponential smoothing with clustering-based acceptance can tolerate majority Byzantine participants, even under extreme non-IID splits (Li et al., 2022). However, this assumes benign global model evolution and may be defeated by stealthy attacks.
Complex attacks and hybrid defenses: New attacks (TrapSetter, orthogonally-perturbed adversaries) adaptively match the aggregator's defenses or exploit degrees of freedom not addressed in standard rules. No single defense algorithm dominates. Hybrid aggregation, i.e. dynamic per-round selection of defense rules via held-out validation, can improve worst-case resilience but introduces new requirements (validation set selection) and overhead (Yue et al., 10 Sep 2024).
Efficient privacy and security: Achieving Byzantine resilience jointly with privacy remains a challenge. Efficient protocols combine similarity-based validation and noninteractive zero-knowledge proofs, or information-theoretic secret sharing with trust-weighted aggregation, to defend simultaneously against adversaries and data leakage (Nie et al., 29 Jul 2024, Xia et al., 14 May 2024).
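The weight-truncation idea from the first item can be sketched as a small preprocessing step. This is a simplified illustration rather than the cited authors' procedure: `max_share` is a hypothetical parameter standing for the fault-tolerant share of the downstream rule, weights are assumed positive, and candidate caps are restricted to the reported weight values.

```python
def top_share(weights, t):
    """Fraction of total weight held by the t heaviest clients."""
    ranked = sorted(weights, reverse=True)
    return sum(ranked[:t]) / sum(ranked)

def truncate_weights(weights, t, max_share):
    """Cap weights so that no t clients hold more than `max_share` in total.

    Tries the reported weights as caps from largest to smallest and
    returns the first (least distorting) capped vector that qualifies.
    """
    for cap in sorted(set(weights), reverse=True):
        capped = [min(w, cap) for w in weights]
        if top_share(capped, t) <= max_share:
            return capped
    raise ValueError("max_share is below t/n, so no cap can satisfy it")

# Illustrative self-reported weights: one client claims a large dataset.
reported = [10.0, 5.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0]
capped = truncate_weights(reported, t=1, max_share=0.35)
```

With the largest cap the single heaviest client would hold 10/21 of the total weight, so the cap drops to 5, where the heaviest client holds 5/16. The capped weights can then be passed to whichever weighted robust rule is in use.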

6. Methodological and Theoretical Summary

The key trade-offs in BFT-FL are quantified by the interplay between the redundancy threshold ( $n > ct$ for various $c$ ), model dimension $d$ , chosen validity region (box, convex, full space), and the target approximation factor $\alpha$ :

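The best known upper bound on $\alpha$ for box validity is $2\sqrt{\min\{n,d\}}$, against a lower bound of $\Omega(\min\{(n-t)/t,\,\sqrt{d}\})$; for convex validity, $\sqrt{2d}$ is both achievable and necessary.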
Tightness: the stated bounds are shown to be optimal up to constant factors for all $n, d, t$ above the feasibility threshold.
In distributed or peer-to-peer settings, the same worst-case bounds and algorithms apply, since the aggregation reduction does not exploit central server structure.

No rule obviates the impact of the classic limiting factors: as $d$ increases or $t$ approaches $n/3$ (or the applicable validity threshold), the worst-case approximation factor must scale accordingly. However, for practical $d$ and non-worst-case adversaries, computationally scalable and communication-efficient aggregation schemes, augmented by dynamic hybridization and local validation, remain empirically effective.
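Dynamic hybridization with local validation can be sketched as choosing, in each round, the candidate aggregate with the lowest loss on a held-out set. In the sketch below `validation_loss` is a hypothetical callable supplied by the caller; a toy squared-distance loss stands in for a real held-out evaluation, and the two candidate rules are chosen only for brevity.

```python
from statistics import median

def mean_rule(updates):
    return [sum(col) / len(updates) for col in zip(*updates)]

def median_rule(updates):
    return [median(col) for col in zip(*updates)]

def select_by_validation(updates, rules, validation_loss):
    """Run every rule and keep the aggregate with the lowest validation loss."""
    candidates = {name: rule(updates) for name, rule in rules.items()}
    best = min(candidates, key=lambda name: validation_loss(candidates[name]))
    return best, candidates[best]

# Toy stand-in for a held-out loss: squared distance to a reference model.
reference = [1.0, 1.0]

def toy_validation_loss(model):
    return sum((m - r) ** 2 for m, r in zip(model, reference))

updates = [[1.0, 1.0], [1.1, 0.9], [0.9, 1.1], [-30.0, -30.0]]
rules = {"mean": mean_rule, "median": median_rule}
chosen_name, chosen_model = select_by_validation(
    updates, rules, toy_validation_loss)
```

The selection is only as trustworthy as the validation data, which is the validation-set requirement noted in Section 5.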

7. References

Centroid approximation theory and validity bounds: "Centroid Approximation for Byzantine-Tolerant Federated Learning" (Cambus et al., 18 Jun 2025).
Nonconvex local SGD with comparative elimination: (Dutta et al., 2023, Gupta et al., 2021).
Layerwise/cosine aggregation: (García-Márquez et al., 27 Mar 2025).
Distributed optimization with PDMM: (Xia et al., 13 Mar 2025).
Asynchronous BFT-FL (Catalyst): (Cox et al., 3 Jun 2024).
Hybrid defenses and adaptive attack strategies: (Yue et al., 10 Sep 2024, Ozfatura et al., 2022).
Robust-FL with predictor-based filtering: (Li et al., 2022).
Weight manipulation and truncation: (Portnoy et al., 2020, Shi et al., 2021).
Information-theoretic privacy and Byzantine robustness: (Nie et al., 29 Jul 2024, Xia et al., 14 May 2024).
Over-the-air geometric median aggregation: (Huang et al., 2021, Sifaou et al., 2022).
Comprehensive taxonomies and challenges: (Shi et al., 2021).