CryptoFair-FL: Secure & Fair Federated Learning

Updated 25 January 2026
  • The paper introduces CryptoFair-FL, a framework that integrates additively homomorphic encryption and secure multi-party computation to verify fairness metrics without revealing sensitive data.
  • It employs a batched binary-tree aggregation protocol to reduce computational overhead, achieving up to a 488× speedup over naïve homomorphic aggregation.
  • Experiments demonstrate that CryptoFair-FL achieves an 86.6% reduction in demographic parity violation and robust defense against attribute inference attacks, at only a moderate (2.3×) increase in cost.

CryptoFair-FL is a cryptographic framework designed to enable privacy-preserving federated learning (FL) with verifiable group fairness guarantees. It integrates additively homomorphic encryption and secure multi-party computation to support rigorous statistical verification of fairness metrics—specifically demographic parity and equalized odds—without disclosure of protected attribute distributions or individual predictions. The framework addresses both honest-but-curious and malicious adversaries, establishes formal information-theoretic lower bounds on privacy degradation, and reduces the computational complexity of fairness verification through a batched, binary-tree aggregation protocol, achieving practical deployment efficiency. Comprehensive experiments on heterogeneous federated datasets illustrate that CryptoFair-FL can meet regulatory fairness targets while defending against attribute inference and imposing only moderate computational overhead (Ali et al., 18 Jan 2026).

1. Framework Overview and Security Model

CryptoFair-FL supports collaborative training of models across $n$ distributed institutions, ensuring that neither raw data nor protected attributes are ever centralized. The primary objectives are: (i) privacy preservation for all participants, (ii) statistically verifiable group fairness guarantees under both cryptographic and differential privacy regimes, and (iii) computational and network demands not exceeding $2.3\times$ those of the baseline Federated Averaging (FedAvg) protocol.

The threat model encompasses:

  • Honest-but-Curious: The central aggregator and up to $t < n/2$ participants follow the protocol but seek to infer sensitive information from protocol transcripts.
  • Malicious Adversaries: Up to $t < n/3$ participants may deviate arbitrarily, including providing forged statistics. The protocol ensures that such misbehavior is detected with probability at least $1 - \delta_{\text{detect}}$.

2. Formalization of Fairness Metrics

CryptoFair-FL focuses on two group fairness criteria in binary classification settings:

  • Demographic Parity Violation:

$$\Delta_{\mathrm{DP}}(\theta) = \left|\,\Pr[\hat Y = 1 \mid A = 0] - \Pr[\hat Y = 1 \mid A = 1]\,\right|$$

where $\hat Y = \mathbf{1}[f_\theta(X) > 0.5]$ and $A \in \{0,1\}$.

  • Equalized Odds Violation:

$$\Delta_{\mathrm{EO}}(\theta) = \max_{y \in \{0,1\}} \left|\,\Pr[\hat Y = 1 \mid A = 0, Y = y] - \Pr[\hat Y = 1 \mid A = 1, Y = y]\,\right|$$

Both notions require securely aggregating predictions and protected attribute counts without revealing local or aggregate statistics.
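In plaintext form (the quantities the protocol ultimately computes over encrypted aggregates), the two metrics can be sketched as follows; the function names and toy data are illustrative, not from the paper:

```python
import numpy as np

def demographic_parity_violation(y_pred, a):
    """|P(Yhat=1 | A=0) - P(Yhat=1 | A=1)| over hard 0/1 predictions."""
    y_pred, a = np.asarray(y_pred), np.asarray(a)
    return abs(y_pred[a == 0].mean() - y_pred[a == 1].mean())

def equalized_odds_violation(y_pred, a, y_true):
    """Max over y in {0,1} of the per-label positive-rate gap between groups."""
    y_pred, a, y_true = map(np.asarray, (y_pred, a, y_true))
    gaps = []
    for y in (0, 1):
        g0 = y_pred[(a == 0) & (y_true == y)].mean()
        g1 = y_pred[(a == 1) & (y_true == y)].mean()
        gaps.append(abs(g0 - g1))
    return max(gaps)

# Toy example: group 0 receives positives at rate 0.75, group 1 at 0.25.
y_pred = np.array([1, 1, 1, 0, 1, 0, 0, 0])
a      = np.array([0, 0, 0, 0, 1, 1, 1, 1])
print(demographic_parity_violation(y_pred, a))  # 0.5
```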

3. Cryptographic and Differential Privacy Foundations

Additively Homomorphic Encryption (AHE)

CryptoFair-FL uses the Paillier cryptosystem with a 2048-bit modulus, supporting the following operations:

  • For plaintexts $m_1, m_2$,

$$\operatorname{Dec}_{sk}\bigl(\operatorname{Enc}_{pk}(m_1) \odot \operatorname{Enc}_{pk}(m_2)\bigr) = m_1 + m_2$$

  • Semantic security relies on the Decisional Composite Residuosity assumption.
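As an illustration of the additive homomorphism, the following toy Paillier implementation (tiny fixed primes rather than a 2048-bit modulus, so it offers no real security) shows that the product of two ciphertexts decrypts to the sum of the plaintexts:

```python
import math, random

# Toy Paillier parameters: illustration only, not the paper's 2048-bit setup.
p, q = 1009, 1013
n, n2 = p * q, (p * q) ** 2
g = n + 1
lam = math.lcm(p - 1, q - 1)   # Carmichael function of n
mu = pow(lam, -1, n)           # with g = n+1, L(g^lam mod n^2) = lam mod n

def L(x):
    return (x - 1) // n

def encrypt(m):
    r = random.randrange(1, n)
    while math.gcd(r, n) != 1:
        r = random.randrange(1, n)
    return (pow(g, m, n2) * pow(r, n, n2)) % n2

def decrypt(c):
    return (L(pow(c, lam, n2)) * mu) % n

def hom_add(c1, c2):
    # The ⊙ operation: multiplying ciphertexts adds the plaintexts.
    return (c1 * c2) % n2

c = hom_add(encrypt(7), encrypt(35))
print(decrypt(c))  # 42: the sum of the plaintexts, computed under encryption
```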

Secure Multi-Party Computation (MPC) Components

  • Threshold Decryption: The private decryption key is $n$-way shared; any $k$-subset of parties can jointly decrypt.
  • Zero-Knowledge Range Proofs: Each party $i$ commits to its local count $s_i$ using a Pedersen commitment $\mathsf{Com}_i = g^{s_i} h^{r_i}$, providing a proof that $s_i \in [0, m_i]$.
  • Aggregate Verification: The aggregator multiplies commitments and validates all range proofs, aborting on any failure.
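The aggregate-verification step relies on the homomorphism of Pedersen commitments: the product of individual commitments commits to the sum of the committed counts. A minimal sketch in a toy group (parameters chosen for illustration; the range proofs themselves are omitted, and a real deployment would use a large group with an $h$ whose discrete log relative to $g$ is unknown):

```python
# Toy Pedersen commitments in a small prime-order subgroup.
p, q = 2039, 1019   # p = 2q + 1; quadratic residues mod p form an order-q subgroup
g, h = 4, 9         # two squares mod p, hence elements of that subgroup

def commit(s, r):
    return (pow(g, s, p) * pow(h, r, p)) % p

# Three parties commit to local counts s_i with blinding factors r_i.
counts = [12, 30, 0]
blinds = [101, 202, 303]
coms = [commit(s, r) for s, r in zip(counts, blinds)]

# The aggregator multiplies commitments; the product commits to the sums.
agg = 1
for c in coms:
    agg = (agg * c) % p

assert agg == commit(sum(counts) % q, sum(blinds) % q)
print(sum(counts))  # 42
```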

$(\varepsilon, \delta)$-Differential Privacy

A randomization mechanism $\mathcal{M}$ is $(\varepsilon, \delta)$-DP if, for all adjacent datasets $D, D'$ and all measurable $S$,

$$\Pr[\mathcal{M}(D) \in S] \leq e^\varepsilon \, \Pr[\mathcal{M}(D') \in S] + \delta.$$

4. Batched Binary-Tree Fairness Verification Protocol

Naïve aggregation of $n$ encrypted statistics incurs $O(n^2)$ cost. CryptoFair-FL reduces this to $O(n \log n)$ using a binary-tree batching protocol, in which local encrypted counts are first summed within small batches, and these batch sums are then recursively homomorphically aggregated in a tree structure.

Protocol outline:

  • Partition the $n$ participants into $\lceil n/B \rceil$ batches of size $B$.
  • Homomorphically sum local ciphertexts within each batch.
  • Recursively aggregate batch ciphertexts over $\lceil \log_2(n/B) \rceil$ levels, generating zero-knowledge pairing proofs at each node.
  • If any proof fails, the protocol is aborted.

The aggregation step at each tree node is:

$$C_j^{(\ell)} = C_{2j-1}^{(\ell-1)} \odot C_{2j}^{(\ell-1)}$$

where $C_j^{(\ell)}$ denotes the $j$-th aggregated ciphertext at tree level $\ell$.

The process requires $O(\log n)$ communication rounds.
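The batching-and-tree structure can be sketched as follows, with plain integer addition standing in for the homomorphic $\odot$ on ciphertexts, and the per-node zero-knowledge proofs omitted:

```python
def tree_aggregate(values, batch_size):
    """Batched binary-tree aggregation; '+' stands in for the homomorphic
    combination of ciphertexts, and proof generation/checking is omitted."""
    # Level 0: sum within each batch of up to batch_size participants.
    batches = [sum(values[i:i + batch_size])
               for i in range(0, len(values), batch_size)]
    rounds = 0
    # Pairwise combine batch sums up the tree: ceil(log2(#batches)) rounds.
    while len(batches) > 1:
        batches = [batches[i] + (batches[i + 1] if i + 1 < len(batches) else 0)
                   for i in range(0, len(batches), 2)]
        rounds += 1
    return batches[0], rounds

vals = list(range(1, 101))   # n = 100 local counts
total, rounds = tree_aggregate(vals, batch_size=10)
print(total, rounds)  # 5050 4  (10 batches -> ceil(log2 10) = 4 tree rounds)
```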

5. Privacy and Security Guarantees

Differential Privacy Parameters

Theorem 6 specifies that, for $T$ rounds of fairness verification, noise scale $\sigma$ per party, and $n$ institutions:

$$\varepsilon = \frac{4\sqrt{2T\ln(2/\delta)}}{\sigma n} + \frac{4T}{(\sigma n)^2}, \quad \delta = 10^{-6}$$

Local Laplace noise (with scale $1/\varepsilon_0$) is added to each count, yielding $(\varepsilon_0, 0)$-DP locally. Aggregation and composition preserve $(\varepsilon, \delta)$-DP globally. For the recommended parameter choices ($\varepsilon = 0.5$, $\delta = 10^{-6}$), privacy loss remains near optimal.
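A small script can evaluate Theorem 6's composition formula and the local noising step; the values of $T$ and $\sigma$ below are assumptions chosen for illustration, not taken from the paper:

```python
import math
import numpy as np

def epsilon_total(T, sigma, n, delta=1e-6):
    """Global epsilon after T fairness-verification rounds (Theorem 6)."""
    return (4 * math.sqrt(2 * T * math.log(2 / delta)) / (sigma * n)
            + 4 * T / (sigma * n) ** 2)

# Illustrative setting: n = 30 institutions (as in the MIMIC-IV experiments);
# T and sigma here are assumed values.
T, sigma, n = 60, 4.0, 30
print(f"global epsilon after {T} rounds: {epsilon_total(T, sigma, n):.3f}")

# Local (eps0, 0)-DP: add Laplace(1/eps0) noise to each raw count.
rng = np.random.default_rng(0)
eps0 = 1.0
noisy_count = 1500 + rng.laplace(scale=1 / eps0)
```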

Information-Theoretic Lower Bounds

Theorem 3 establishes that any mechanism aiming to verify $\Delta_{\mathrm{DP}}$ with additive tolerance $\tau$ must satisfy:

$$\varepsilon \geq \frac{2}{\tau\,\min\{n_0, n_1\}}$$

where $n_a$ is the record count for protected-attribute value $a$. This lower bound is obtained by reduction to distinguishing adjacent datasets via hypothesis testing.
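The bound is easy to evaluate numerically; the group sizes below are hypothetical:

```python
# Theorem 3's information-theoretic bound: eps >= 2 / (tau * min{n0, n1}).
def epsilon_lower_bound(tau, n0, n1):
    return 2 / (tau * min(n0, n1))

# Hypothetical group sizes: an 800-record minority group, tolerance 0.05.
print(epsilon_lower_bound(0.05, 5000, 800))  # 0.05
```

Note that the smaller group dominates: halving the minority group doubles the minimum achievable $\varepsilon$.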

Malicious and Honest-but-Curious Adversary Defense

Zero-knowledge range proofs and threshold decryption address malicious misreporting and protect against collusion. Differential privacy, combined with cryptographically protected aggregation, renders successful attribute inference infeasible, with adversarial success rates empirically reduced to near random guessing ($<0.05$ advantage).

6. Experimental Evaluation

Datasets

  • MIMIC-IV (30 hospitals), mortality classification, protected attribute: race.
  • Adult Income (50 institutions), protected: sex.
  • CelebA (40 participants), protected: gender and age.
  • FedFair-100: synthetic, 100 institutions, census-calibrated heterogeneity.

Results

| Protocol | $\Delta_{\mathrm{DP}}$ (MIMIC-IV) | AUROC (MIMIC-IV) | Overhead |
|---|---|---|---|
| FedAvg (standard) | 0.231 | $\approx 0.868$ | $1\times$ |
| CryptoFair-FL | 0.031 | $\approx 0.857$ | $2.3\times$ |
  • CryptoFair-FL reduces demographic parity violation from $0.231$ to $0.031$ (an $86.6\%$ reduction).
  • AUROC remains within $0.011$ of the centralized baseline.
  • Communication/compute cost increases by $2.3\times$ vs. FedAvg.
  • Batched verification yields up to a $488\times$ speedup versus naïve HE at $n = 100$.
  • Attribute inference attack success drops from $0.72$–$0.81$ to $0.48$–$0.53$ with CryptoFair-FL (approaching the $0.50$ baseline for random guessing).

7. Privacy–Fairness Tradeoff and Applicability

Residual fairness error scales as $O(1/\varepsilon)$, as formalized in Theorem 7:

$$\Delta_{\mathrm{DP}} \approx \frac{C}{\varepsilon}, \quad C \approx 0.096$$

Empirical measurements align closely with this bound across all datasets. In the 30-hospital MIMIC-IV setting, $\Delta_{\mathrm{DP}}$ falls below the $0.05$ regulatory target within 60 communication rounds while AUROC remains above $0.85$. CryptoFair-FL thus achieves near-optimal privacy–fairness efficiency, within $20\%$ of the theoretical lower bound.
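Under this scaling law, one can solve for the privacy budget needed to hit a given fairness target; a minimal sketch using the $0.05$ regulatory threshold mentioned above:

```python
# Solving Theorem 7's scaling law Delta_DP ≈ C / eps (C ≈ 0.096) for the
# privacy budget that meets a fairness target.
C = 0.096
target = 0.05   # the regulatory Delta_DP threshold cited in the text
eps_needed = C / target
print(f"epsilon needed for Delta_DP <= {target}: {eps_needed:.2f}")
```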

The integration of statistical fairness verification, cryptographic privacy, and practical aggregation efficiency makes CryptoFair-FL particularly suitable for regulated, high-stakes collaborative learning scenarios such as healthcare and finance, where legal mandates require both privacy and accountability (Ali et al., 18 Jan 2026).

