CryptoFair-FL: Secure & Fair Federated Learning
- The paper introduces CryptoFair-FL, a framework that integrates additively homomorphic encryption and secure multi-party computation to verify fairness metrics without revealing sensitive data.
- It employs a batched binary-tree aggregation protocol to reduce computational overhead, achieving up to a 488× speedup over naïve homomorphic aggregation.
- Experiments demonstrate that CryptoFair-FL achieves an 86.6% reduction in demographic parity violation and robust defense against attribute inference attacks with only moderate cost increase.
CryptoFair-FL is a cryptographic framework designed to enable privacy-preserving federated learning (FL) with verifiable group fairness guarantees. It integrates additively homomorphic encryption and secure multi-party computation to support rigorous statistical verification of fairness metrics—specifically demographic parity and equalized odds—without disclosure of protected attribute distributions or individual predictions. The framework addresses both honest-but-curious and malicious adversaries, establishes formal information-theoretic lower bounds on privacy degradation, and reduces the computational complexity of fairness verification through a batched, binary-tree aggregation protocol, achieving practical deployment efficiency. Comprehensive experiments on heterogeneous federated datasets illustrate that CryptoFair-FL can meet regulatory fairness targets while defending against attribute inference and imposing only moderate computational overhead (Ali et al., 18 Jan 2026).
1. Framework Overview and Security Model
CryptoFair-FL supports collaborative training of models across distributed institutions, ensuring that neither raw data nor protected attributes are ever centralized. The primary objectives are: (i) privacy preservation for all participants, (ii) statistically verifiable group fairness guarantees under both cryptographic and differential privacy regimes, and (iii) computational and network demands not exceeding the baseline Federated Averaging (FedAvg) protocol.
The threat model encompasses:
- Honest-but-Curious: The central aggregator and a bounded subset of participants follow the protocol but seek to infer sensitive information from protocol transcripts.
- Malicious Adversaries: A bounded number of participants may deviate arbitrarily, including submitting forged statistics. The protocol ensures that such misbehavior is detected with all but negligible probability.
2. Formalization of Fairness Metrics
CryptoFair-FL focuses on two group fairness criteria in binary classification settings:
- Demographic Parity Violation:
$\Delta_{\mathrm{DP}} = \left| \Pr(\hat{Y}=1 \mid A=0) - \Pr(\hat{Y}=1 \mid A=1) \right|,$
where $\hat{Y}$ denotes the model's binary prediction and $A \in \{0,1\}$ the protected attribute.
- Equalized Odds Violation:
$\Delta_{\mathrm{EO}} = \max_{y \in \{0,1\}} \left| \Pr(\hat{Y}=1 \mid A=0, Y=y) - \Pr(\hat{Y}=1 \mid A=1, Y=y) \right|,$
where $Y$ is the true label.
Both notions require securely aggregating predictions and protected attribute counts without revealing local or aggregate statistics.
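As a concrete illustration of the two metrics (a plaintext sketch for clarity, not the paper's encrypted computation), both violations can be evaluated directly from lists of predictions, labels, and protected attributes:

```python
def demographic_parity_violation(y_pred, a):
    """|P(Yhat=1 | A=0) - P(Yhat=1 | A=1)| for binary lists."""
    def rate(g):
        sel = [p for p, ai in zip(y_pred, a) if ai == g]
        return sum(sel) / len(sel)
    return abs(rate(0) - rate(1))

def equalized_odds_violation(y_pred, y_true, a):
    """max over y in {0,1} of |P(Yhat=1 | A=0, Y=y) - P(Yhat=1 | A=1, Y=y)|."""
    def rate(g, y):
        sel = [p for p, t, ai in zip(y_pred, y_true, a) if ai == g and t == y]
        return sum(sel) / len(sel)
    return max(abs(rate(0, y) - rate(1, y)) for y in (0, 1))
```

In CryptoFair-FL the per-group counts underlying these ratios are aggregated under encryption rather than computed in the clear as above.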
3. Cryptographic and Differential Privacy Foundations
Additively Homomorphic Encryption (AHE)
CryptoFair-FL uses the Paillier cryptosystem with a 2048-bit modulus $n$, supporting the operations:
- For plaintexts $m_1, m_2$: $\mathrm{Enc}(m_1) \cdot \mathrm{Enc}(m_2) \bmod n^2 = \mathrm{Enc}(m_1 + m_2 \bmod n)$ and $\mathrm{Enc}(m_1)^k \bmod n^2 = \mathrm{Enc}(k \cdot m_1 \bmod n)$.
- Semantic security relies on the Decisional Composite Residuosity assumption.
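A minimal sketch of Paillier's additive homomorphism, using toy primes for readability (all parameters here are illustrative; a real deployment uses a 2048-bit modulus as stated above):

```python
import math, random

# Toy Paillier keypair with tiny primes (illustration only, not secure).
p, q = 293, 433
n = p * q
n2 = n * n
g = n + 1                      # standard choice g = n + 1
lam = math.lcm(p - 1, q - 1)   # Carmichael function lambda(n)

def L(u):                      # L(u) = (u - 1) / n
    return (u - 1) // n

mu = pow(L(pow(g, lam, n2)), -1, n)   # mu = (L(g^lambda mod n^2))^{-1} mod n

def enc(m, rng=random.Random(0)):
    while True:
        r = rng.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return (pow(g, m, n2) * pow(r, n, n2)) % n2

def dec(c):
    return (L(pow(c, lam, n2)) * mu) % n

# Additive homomorphism: multiplying ciphertexts adds plaintexts.
c1, c2 = enc(17), enc(25)
assert dec((c1 * c2) % n2) == 42
# Scalar multiplication: exponentiating a ciphertext scales the plaintext.
assert dec(pow(c1, 3, n2)) == 51
```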
Secure Multi-Party Computation (MPC) Components
- Threshold Decryption: The private decryption key is $(t, n)$-shared among the $n$ participants; any subset of $t$ parties can jointly decrypt, while smaller subsets learn nothing.
- Zero-Knowledge Range Proofs: Each party $i$ commits to its local count $c_i$ using a Pedersen commitment $C_i = g^{c_i} h^{r_i}$, providing a zero-knowledge proof that $c_i$ lies in the valid range $0 \le c_i \le n_i$, where $n_i$ is its local dataset size.
- Aggregate Verification: The aggregator multiplies commitments and validates all range proofs, aborting on any failure.
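The commitment-aggregation step can be sketched as follows, in a toy prime-order subgroup with illustrative generators (the range proofs themselves are elided; real deployments use much larger groups):

```python
import random

# Toy Pedersen commitments in a prime-order subgroup (illustration only).
p, q = 2039, 1019          # safe prime p = 2q + 1, with q prime
g, h = 4, 9                # squares mod p, so both lie in the order-q subgroup

def commit(value, r):
    """Pedersen commitment C = g^value * h^r mod p."""
    return (pow(g, value, p) * pow(h, r, p)) % p

rng = random.Random(1)
counts = [5, 12, 7]                          # each party's private local count
blinds = [rng.randrange(q) for _ in counts]  # random blinding factors
commitments = [commit(c, r) for c, r in zip(counts, blinds)]

# The aggregator multiplies commitments; the product commits to the sum.
product = 1
for C in commitments:
    product = (product * C) % p

total, total_blind = sum(counts) % q, sum(blinds) % q
assert product == commit(total, total_blind)
```

The homomorphic property shown in the final assertion is what lets the aggregator verify a sum of counts without seeing any individual count.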
$(\varepsilon, \delta)$-Differential Privacy
A randomized mechanism $M$ is $(\varepsilon, \delta)$-DP if, for all adjacent datasets $D, D'$ and all measurable sets $S$,
$\Pr[M(D) \in S] \le e^{\varepsilon} \Pr[M(D') \in S] + \delta.$
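A brief sketch of the Laplace mechanism for a sensitivity-1 count query, which satisfies pure $\varepsilon$-DP (the special case $\delta = 0$ of the definition above); the sampler uses the fact that the difference of two exponential draws is Laplace-distributed:

```python
import random

def laplace_noisy_count(true_count, epsilon, rng):
    """Release a count with Laplace(1/epsilon) noise.

    A count query has sensitivity 1, so this single release is epsilon-DP.
    The difference of two Exp(epsilon) draws is Laplace with scale 1/epsilon.
    """
    noise = rng.expovariate(epsilon) - rng.expovariate(epsilon)
    return true_count + noise

rng = random.Random(7)
samples = [laplace_noisy_count(100, 1.0, rng) for _ in range(20000)]
mean = sum(samples) / len(samples)
# The noise is zero-mean, so noisy counts concentrate around the true count.
```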
4. Batched Binary-Tree Fairness Verification Protocol
Naïve aggregation of encrypted statistics incurs cost linear in the number of participants $n$. CryptoFair-FL reduces the round complexity to $O(\log n)$ using a binary-tree batching protocol, in which local encrypted counts are first summed within small batches, and these batch sums are then recursively homomorphically aggregated in a tree structure.
Protocol outline:
- Partition the $n$ participants into batches of size $b$.
- Homomorphically sum local ciphertexts within each batch.
- Recursively aggregate batch ciphertexts in $O(\log(n/b))$ levels, generating zero-knowledge pairing proofs at each node.
- If any proof fails, the protocol is aborted.
The aggregation step at each tree node is the ciphertext product
$c_{\text{parent}} = c_{\text{left}} \cdot c_{\text{right}} \bmod n^2,$
which homomorphically adds the underlying plaintext counts. The process requires $O(\log n)$ communication rounds.
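The protocol outline can be sketched as follows, with plain integer addition standing in for the homomorphic ciphertext product and the per-node proof checks elided:

```python
import math

def tree_aggregate(local_counts, batch_size):
    """Batched binary-tree aggregation sketch.

    Plain addition stands in for the Paillier ciphertext product; the
    zero-knowledge proof verification at each node is omitted.
    Returns (aggregate, number_of_tree_levels).
    """
    # Step 1: sum local values within each batch.
    sums = [sum(local_counts[i:i + batch_size])
            for i in range(0, len(local_counts), batch_size)]
    # Step 2: pairwise-combine batch sums, level by level.
    levels = 0
    while len(sums) > 1:
        sums = [sums[i] + (sums[i + 1] if i + 1 < len(sums) else 0)
                for i in range(0, len(sums), 2)]
        levels += 1
    return sums[0], levels

total, levels = tree_aggregate(list(range(100)), batch_size=8)
# 100 parties in batches of 8 -> 13 batch sums -> ceil(log2(13)) = 4 tree levels.
```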
5. Privacy and Security Guarantees
Differential Privacy Parameters
Theorem 6 quantifies the cumulative privacy loss over $T$ rounds of fairness verification with per-party noise scale $\lambda$ across $n$ institutions. Local Laplace noise with scale $\lambda$ is added to each count (a sensitivity-1 query), yielding $\varepsilon$-DP locally with $\varepsilon = 1/\lambda$. Aggregation and sequential composition preserve $(\varepsilon, \delta)$-DP globally across all rounds. For the recommended parameter choices, privacy loss remains near optimal.
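A small simulation (with illustrative parameters, not values from the paper) of why per-party Laplace noise still yields usable aggregates: the noise standard deviation grows as $\sqrt{n}$ while the signal grows as $n$, so the relative error of the sum shrinks as institutions are added:

```python
import random

rng = random.Random(3)
n, eps, true_count = 100, 0.5, 40   # illustrative: 100 parties, eps-DP each

def lap(scale):
    # Laplace(scale) as the difference of two Exp(1/scale) draws.
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)

# Each party releases its count with local Laplace(1/eps) noise.
noisy_total = sum(true_count + lap(1 / eps) for _ in range(n))
relative_error = abs(noisy_total - n * true_count) / (n * true_count)
# Noise std is sqrt(2n)/eps ~ 28 here, against a signal of 4000.
```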
Information-Theoretic Lower Bounds
Theorem 3 establishes that any mechanism verifying $\Delta_{\mathrm{DP}}$ with additive tolerance $\alpha$ must satisfy
$\varepsilon = \Omega\!\left( \frac{1}{\alpha \cdot \min_a n_a} \right),$
where $n_a$ is the record count for protected-attribute value $a$. This lower bound is obtained by reduction to distinguishing adjacent datasets via hypothesis testing.
Malicious and Honest-but-Curious Adversary Defense
Zero-knowledge range proofs and threshold decryption address malicious misreporting and protect against collusion. Differential privacy, combined with cryptographically protected aggregation, renders successful attribute inference infeasible, with adversarial success rates empirically reduced to near random guessing (negligible advantage).
6. Experimental Evaluation
Datasets
- MIMIC-IV (30 hospitals), mortality classification, protected attribute: race.
- Adult Income (50 institutions), protected: sex.
- CelebA (40 participants), protected: gender and age.
- FedFair-100: synthetic, 100 institutions, census-calibrated heterogeneity.
Results
| Protocol | $\Delta_{\mathrm{DP}}$ (MIMIC-IV) | AUROC (MIMIC-IV) | Overhead |
|---|---|---|---|
| FedAvg (standard) | 0.231 | 0.868 | 1× (baseline) |
| CryptoFair-FL | 0.031 | 0.857 | moderate |
- CryptoFair-FL reduces demographic parity violation from $0.231$ to $0.031$ (an 86.6% reduction).
- AUROC remains within $0.011$ of the centralized baseline.
- Communication and compute cost increases only moderately relative to FedAvg.
- Batched verification yields up to a 488× speedup versus naïve homomorphic aggregation at the largest participant counts.
- Attribute inference attack success drops from $0.72$–$0.81$ to $0.48$–$0.53$ with CryptoFair-FL (approaching the $0.50$ baseline for random guessing).
7. Privacy–Fairness Tradeoff and Applicability
Residual fairness error scales inversely with the privacy budget and the minority-group sample size, as formalized in Theorem 7:
$\Delta_{\text{residual}} = O\!\left( \frac{1}{\varepsilon \cdot \min_a n_a} \right).$
Empirical measurements align closely with this bound across all datasets. In the 30-hospital MIMIC-IV setting, $\Delta_{\mathrm{DP}}$ falls below the $0.05$ regulatory target within 60 communication rounds while AUROC remains above 0.85. CryptoFair-FL thus achieves near-optimal privacy-fairness efficiency, within a small constant factor of the theoretical lower bound.
The integration of statistical fairness verification, cryptographic privacy, and practical aggregation efficiency makes CryptoFair-FL particularly suitable for regulated, high-stakes collaborative learning scenarios such as healthcare and finance, where legal mandates require both privacy and accountability (Ali et al., 18 Jan 2026).