Secure Aggregation Protocols
- Secure aggregation protocols are cryptographic methods that aggregate private values (e.g., sums or means) while keeping individual inputs confidential.
- They employ techniques such as masking, homomorphic encryption, and secret sharing, which are vital in federated learning and distributed analytics.
- Protocols incorporate dropout resilience, collusion resistance, and may use hardware-assisted approaches to achieve efficient and secure aggregation.
Secure aggregation protocols are cryptographic primitives that enable a server to compute an aggregate (such as the sum, the mean, or another linear function) of private values held by a set of mutually distrusting clients, without learning anything about any individual input beyond what is revealed by the aggregate itself. These protocols are critical in privacy-preserving federated learning (FL), distributed data analytics, and wireless sensor networks, where model updates or sensor readings must be combined without exposing individual contributions.
1. Core Cryptographic Principles and Threat Models
Secure aggregation protocols operate in diverse adversarial models and network settings. The main threat and failure models are:
- Honest-but-curious (semi-honest): Parties (server and/or clients) follow the protocol but attempt to infer unauthorized information from observed messages (Wang et al., 2024).
- Malicious adversary: Parties may deviate arbitrarily from the protocol by injecting, dropping, or tampering with messages, possibly in collusion with others (Wen et al., 19 May 2025). Some protocols additionally address adaptive corruptions, where the adversary dynamically selects clients to compromise during protocol execution (Kadhe et al., 2020).
- Dropout tolerance: Protocols are typically robust to a parameterized number of client dropouts per round.
Protocol security goals include confidentiality (input privacy), integrity (only correctly computed aggregates are accepted), collusion resistance (a corrupted server colluding with up to t clients still cannot breach honest users’ privacy), and robustness to both passive and active attacks.
2. Protocol Paradigms and Mechanisms
2.1 Masking-Based Protocols
The canonical structure is additive masking: each user’s input is masked with randomness shared or structured so that the sum of all masks cancels out in aggregation. Examples include:
- Pairwise mask sharing: Each user shares random masks with peers; masks cancel if all parties participate. Dropout-resilience is provided by explicit key-sharing or Shamir secret-sharing (Zhang et al., 2023).
- Homomorphic masking: Each input $x_i$ is combined with a mask $m_i$ so that the masks can be removed from the aggregate, i.e. $\sum_i (x_i + m_i) - \sum_i m_i = \sum_i x_i$. These masks may be established with the server via key agreement, Diffie-Hellman, or threshold cryptographic schemes (Zhang et al., 2023, Wang et al., 2024).
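The cancellation property of pairwise additive masks can be illustrated with a minimal sketch. It assumes integer inputs reduced modulo a fixed value and draws each pairwise mask directly from a random source; in an actual protocol the mask for a pair would be derived from a key agreement between the two users. The names `MODULUS`, `pairwise_masks`, `mask_inputs`, and `aggregate` are illustrative and do not come from any cited protocol.

```python
import secrets

MODULUS = 2**32  # assumed working modulus for all arithmetic


def pairwise_masks(num_users, modulus=MODULUS):
    """Return one net mask per user; the masks sum to 0 modulo `modulus`."""
    masks = [0] * num_users
    for i in range(num_users):
        for j in range(i + 1, num_users):
            # Stand-in for a value both users would derive via key agreement.
            r_ij = secrets.randbelow(modulus)
            masks[i] = (masks[i] + r_ij) % modulus  # user i adds the shared value
            masks[j] = (masks[j] - r_ij) % modulus  # user j subtracts the same value
    return masks


def mask_inputs(inputs, modulus=MODULUS):
    """Each user uploads its input plus its net mask."""
    masks = pairwise_masks(len(inputs), modulus)
    return [(x + m) % modulus for x, m in zip(inputs, masks)]


def aggregate(masked_inputs, modulus=MODULUS):
    """The server only sums the masked uploads; the masks cancel in the sum."""
    return sum(masked_inputs) % modulus
```

For example, `aggregate(mask_inputs([3, 10, 25, 4]))` returns 42, while each individual masked upload is uniformly distributed modulo `MODULUS`. The sketch covers only the case where every user participates; dropout handling needs the additional key-sharing step mentioned above.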
2.2 Homomorphic Encryption-Based Aggregation
Protocols can leverage threshold additively homomorphic encryption (AHE) or fully homomorphic encryption (FHE):
- DTAHE and FHE protocols: Each client encrypts its data under a joint public key; the server aggregates the ciphertexts homomorphically, and decryption is then performed jointly via threshold secret sharing of the decryption key (e.g., with lattice-based schemes or EC ElGamal). Additively homomorphic schemes support sums and other linear aggregates, while FHE also permits non-linear aggregation functions (Laage et al., 11 Apr 2025, Tian et al., 2021, Yang et al., 2023).
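Homomorphic aggregation of ciphertexts can be sketched with a toy Paillier cryptosystem. Paillier is swapped in here only because it is a compact, well-known additively homomorphic scheme; it is not the lattice-based or EC ElGamal construction the cited works use. The sketch also assumes a single decryption key rather than threshold decryption, and the primes are far too small for real use. All function names are illustrative.

```python
import math
import secrets

# Toy primes for illustration only; real deployments use primes of 1024+ bits.
P, Q = 1009, 1013


def keygen(p=P, q=Q):
    """Return the public modulus n and the private key (lam, mu)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    mu = pow(lam, -1, n)  # valid because the generator is fixed to g = n + 1
    return n, (lam, mu)


def encrypt(n, message):
    """Encrypt an integer 0 <= message < n under public modulus n."""
    n_sq = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1
        if math.gcd(r, n) == 1:
            break
    return (pow(1 + n, message, n_sq) * pow(r, n, n_sq)) % n_sq


def add_ciphertexts(n, ciphertexts):
    """Multiplying ciphertexts modulo n^2 adds the underlying plaintexts."""
    n_sq = n * n
    acc = 1
    for c in ciphertexts:
        acc = (acc * c) % n_sq
    return acc


def decrypt(n, private_key, ciphertext):
    """Recover the plaintext (here: the sum) modulo n."""
    lam, mu = private_key
    u = pow(ciphertext, lam, n * n)
    return ((u - 1) // n * mu) % n
```

With `n, sk = keygen()`, encrypting 5, 17, and 20 and decrypting `add_ciphertexts(n, ...)` yields 42. The server never decrypts an individual ciphertext; in a threshold variant, `sk` would be secret-shared so that no single party could do so.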
2.3 Secret Sharing (Shamir, Fast Fourier, and Sharding)
Scalability is improved by secret sharing schemes that avoid a quadratic number of pairwise interactions:
- Shamir-based sharing: Each input is split into shares by evaluating a random polynomial whose constant term is the input; only qualified sets (at least a threshold number of shares) can reconstruct the secret, via polynomial interpolation (Stevens et al., 2022).
- Group-based and sharded approaches: Input is split across multiple small groups (shards), each group aggregates its piece, and the global sum is reconstructed by the server; sublinear communication per client is achieved (Stevens et al., 2022).
- FFT-based (FastShare): Multi-secret sharing using Fourier structures enables linear encoding/decoding and efficient resilience to dropouts and corruptions (Kadhe et al., 2020).
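The Shamir-based variant can be sketched as follows. The example assumes a prime field with modulus 2^61 - 1 and uses the fact that shares held at the same evaluation point can be added locally, so that interpolating the summed shares yields the sum of the secrets. It illustrates plain Shamir sharing only, not the sharded or FFT-based constructions; all names are illustrative.

```python
import secrets

PRIME = 2**61 - 1  # Mersenne prime used as the field modulus (assumed parameter)


def make_shares(secret, threshold, num_shares, prime=PRIME):
    """Split `secret` into points on a random polynomial of degree threshold - 1."""
    coeffs = [secret % prime] + [secrets.randbelow(prime) for _ in range(threshold - 1)]
    shares = []
    for x in range(1, num_shares + 1):
        y = 0
        for c in reversed(coeffs):  # Horner evaluation of the polynomial at x
            y = (y * x + c) % prime
        shares.append((x, y))
    return shares


def reconstruct(shares, prime=PRIME):
    """Lagrange interpolation at x = 0 recovers the constant term."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * (-xj)) % prime
                den = (den * (xi - xj)) % prime
        secret = (secret + yi * num * pow(den, -1, prime)) % prime
    return secret


def add_shares(all_shares, prime=PRIME):
    """Sum the shares that different clients placed at the same evaluation point."""
    summed = []
    for column in zip(*all_shares):
        x = column[0][0]
        summed.append((x, sum(y for _, y in column) % prime))
    return summed
```

If three clients share the inputs 3, 10, and 29 with `make_shares(v, 3, 5)`, any three entries of `add_shares(...)` reconstruct to 42, so up to two share holders can drop out without blocking the aggregate.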
2.4 Hybrid and Hardware-Assisted Approaches
To balance computational overhead with strong privacy:
- TEE-based hybrids: Secure aggregation and decryption occur inside Trusted Execution Environments (TEEs, such as Intel SGX enclaves), which process and aggregate encrypted contributions at near-native speed, often with remote attestation to establish trust (Laage et al., 11 Apr 2025).
- Homomorphic + hardware hybrids: Recent protocols exploit both cryptographic (MK-CKKS, ECDH) and hardware primitives for one-shot, non-interactive aggregation with constant per-user upload cost (Emmaka et al., 28 Nov 2025).
2.5 Shuffle Model and Differential Privacy
The shuffled model inserts a random shuffler between users and the aggregator:
- Invisibility cloak encoder: Each value is split into random-looking shares; a shuffler randomly permutes all shares before aggregation, providing both input privacy and, with calibrated noise, differential privacy (Ghazi et al., 2019).
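A minimal sketch of the split-and-shuffle idea, assuming integer inputs modulo a public value and a shuffler modeled as a local random permutation. It shows only the encoding and the aggregation; the calibrated noise needed for differential privacy is omitted, and the function names are illustrative rather than taken from Ghazi et al.

```python
import random
import secrets


def encode(value, num_shares, modulus):
    """Split `value` into additive shares that sum to it modulo `modulus`."""
    shares = [secrets.randbelow(modulus) for _ in range(num_shares - 1)]
    last = (value - sum(shares)) % modulus
    return shares + [last]


def shuffle_and_aggregate(all_messages, modulus):
    """Pool every share, permute the pool, and sum it.

    The permutation hides which user sent which share; the sum is unaffected.
    """
    pool = [share for messages in all_messages for share in messages]
    random.SystemRandom().shuffle(pool)
    return sum(pool) % modulus
```

For instance, encoding 7, 15, and 20 with four shares each under modulus 2**16 and passing the result to `shuffle_and_aggregate` returns 42. The number of shares per user is the parameter that trades communication for privacy in this model.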
3. Protocol Designs and Workflow
The typical workflow is composed of:
- Setup: Key exchanges, mask/seed sharing, or key generation in a distributed or centralized manner (Zhang et al., 2023, Laage et al., 11 Apr 2025).
- Input submission: Each client masks (or encrypts) their input and uploads the masked/ciphertext vector, often in a single round (Wang et al., 2024, Behnia et al., 2023, Emmaka et al., 28 Nov 2025).
- Dropout recovery: Protocols incorporate mechanisms such as Shamir threshold sharing to reconstruct the masks of users who drop out before aggregation (Zhang et al., 2023, Kadhe et al., 2020).
- Aggregation and unmasking: The server aggregates the uploads (sum, linear function, or more general operation), removes global masks or decrypts the sum using input from surviving or threshold clients (Emmaka et al., 28 Nov 2025, Tian et al., 2021).
- Proof and verification (optional): Some protocols enable clients to verify the integrity of the aggregation step via homomorphic commitments or pairing-based signatures (Behnia et al., 2023, Wen et al., 19 May 2025).
4. Scalability, Communication, and Efficiency
Protocol scalability and efficiency are characterized by:
| Protocol Family | Client Complexity | Server Complexity | Per-User Comm. | Dropout Handling |
|---|---|---|---|---|
| Pairwise Mask-based (SecAgg) | O(n) | O(n²) | O(n + d) | Explicit unmasking |
| Secret Sharing (ShardAgg) | O(log n) | O(n) | O(d log n) | Built-in, no recovery |
| Homomorphic (DTAHE, MK-CKKS) | O(1) | O(n) | O(d) | Threshold, optional |
| TEE-based | O(1) | O(n) | O(d) | N/A (hardware trust) |
| Shuffle-Model/Invisibility Cloak | polylog(n) | O(n) | polylog(n) | Not needed |
Notably, protocols with sublinear client communication (e.g., ShardAgg, FastSecAgg) allow aggregation at scales of $10^8$ clients, with each client communicating with only O(log n) peers (Stevens et al., 2022, Kadhe et al., 2020). One-shot aggregation with constant per-user cost independent of n has also been demonstrated (Emmaka et al., 28 Nov 2025).
5. Robustness, Security Proofs, and Limitations
5.1 Security Guarantees
- Information-theoretic privacy: Secret-sharing based schemes provide perfect privacy against up to t colluding users (and/or server) (Stevens et al., 2022, Kadhe et al., 2020).
- Computational privacy: Protocols that rely on hardness assumptions such as DDH, CDH, or RLWE, or on the IND-CPA security of the underlying encryption or masking primitives, protect inputs against computationally bounded adversaries (Tian et al., 2021, Zhang et al., 2023, Laage et al., 11 Apr 2025).
- Differential privacy: Secure aggregation can be combined with calibrated noise, as in the shuffled model or in stateful aggregation for DP-FTRL (Ball et al., 2024, Ghazi et al., 2019).
5.2 Integrity and Verifiability
Certain systems offer explicit proof that the server cannot forge aggregation results or misreport sums, based on homomorphic vector commitments or pairing-based aggregation proofs (Behnia et al., 2023, Wen et al., 19 May 2025).
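One way to picture a commitment-based check is with Pedersen commitments, which are additively homomorphic: the product of the clients' commitments is a commitment to the sum of their values. The sketch below is a toy under stated assumptions and not the construction of the cited works. It uses a tiny safe prime, generators whose discrete-log relation is not actually hidden, and assumes that the commitment randomness is aggregated alongside the values. All names are illustrative.

```python
import secrets

P = 1019  # toy safe prime, P = 2 * Q + 1 (far too small for real use)
Q = 509   # prime order of the subgroup of squares modulo P
G = 4     # 2^2, generates the order-Q subgroup
H = 9     # 3^2, second generator; in practice log_G(H) must be unknown


def commit(value, randomness):
    """Pedersen commitment G^value * H^randomness modulo P."""
    return (pow(G, value % Q, P) * pow(H, randomness % Q, P)) % P


def combine(commitments):
    """Multiply commitments; the result commits to the sum of the values."""
    acc = 1
    for c in commitments:
        acc = (acc * c) % P
    return acc


def verify_aggregate(commitments, claimed_sum, claimed_randomness):
    """Check the server's claimed sum against the published commitments."""
    return combine(commitments) == commit(claimed_sum, claimed_randomness)
```

If clients commit to 5, 17, and 20 with random values `r_i`, then `verify_aggregate(cs, 42, sum(r_i))` holds, while a misreported sum of 43 with the same randomness fails the check. Because exponents are reduced modulo `Q`, the toy parameters only distinguish sums modulo 509.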
5.3 Fault Tolerance and Dropout Resilience
Mechanisms for resilience to client dropout include:
- Threshold mask/secret sharing: Aggregation can proceed as long as any t of the n clients complete the round; unmasking or threshold decryption then combines t shares (Zhang et al., 2023, Tian et al., 2021).
- Resharing/blame protocols: If a dropout is detected, secret shares or decryption shares are reshared for recovery (Zhang et al., 2023).
- Built-in, group-based redundancy: Group designs (e.g., ShardAgg) and code-based (FFT) approaches natively tolerate high dropouts, without explicit recovery (Stevens et al., 2022, Kadhe et al., 2020).
5.4 Limitations
- Malicious robustness: Most protocols provide formal proofs under the semi-honest model; malicious security requires added integrity checks, signature schemes, or interactive proofs, which can increase cost (Wen et al., 19 May 2025).
- Bandwidth and latency: Homomorphic encryption, while computationally light for some schemes, can entail large ciphertext expansion (e.g., ≈12× over plaintext for MK-CKKS in Hyb-Agg (Emmaka et al., 28 Nov 2025)).
- Hardware trust: TEE-based variants improve performance but require trust in CPU manufacturers and are vulnerable to side-channel attacks (Laage et al., 11 Apr 2025).
6. Specializations and Advanced Designs
- Asynchronous protocols: Buffered Asynchronous Secure Aggregation (BASA) enables secure aggregation in asynchronous federated learning, mitigating straggler effects and device heterogeneity (Wang et al., 2024).
- Heterogeneous aggregation: SVAFD generalizes secure aggregation to settings like federated distillation, where clients hold heterogeneous models and exchange logits rather than weights, which requires verifiable multilateral co-aggregation and filtering of malicious updates (Wen et al., 19 May 2025).
- Differential privacy through stateful primitives: Secure stateful aggregation supports advanced DP mechanisms (e.g., DP-FTRL) by enabling the server to store and later read linear combinations of correlated noisy aggregates without trusted curator assumptions (Ball et al., 2024).
- Protocol-level defenses: Recent analysis exposes flaws in static-masking schemes such as MicroSecAgg (Zhang et al., 2024) and prescribes using per-iteration unpredictable masking keys (PRFs) to preserve privacy across rounds.
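The per-iteration masking recommendation can be sketched with a keyed PRF. The example assumes HMAC-SHA256 as the PRF and a long-term key shared between the two parties that derive the mask; the function name, the modulus, and the input encoding (round number followed by coordinate index) are illustrative choices, not details of the cited analysis.

```python
import hashlib
import hmac

MODULUS = 2**32  # assumed working modulus


def round_mask(shared_key, round_number, dim):
    """Derive a fresh mask vector for one round using HMAC-SHA256 as the PRF."""
    mask = []
    for index in range(dim):
        # Binding the round number into the PRF input makes every round's
        # mask independent of the masks used in earlier rounds.
        message = round_number.to_bytes(8, "big") + index.to_bytes(8, "big")
        digest = hmac.new(shared_key, message, hashlib.sha256).digest()
        mask.append(int.from_bytes(digest[:4], "big") % MODULUS)
    return mask
```

The motivation is that a static mask m leaks differences across rounds: an observer of x + m in one round and x' + m in the next learns x - x' by subtraction. Deriving the mask from `round_mask(key, t, dim)` with a new `t` each round removes that relation, as long as the key stays secret and round numbers are never reused.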
7. Applications, Impact, and Future Directions
Secure aggregation protocols underpin privacy-preserving federated learning across mobile devices, large-scale IoT deployments, wireless sensor networks, and multi-party distributed analytics. They are core to Google's production FL, industrial cross-silo collaborative analytics, and sensor networks in adversarial environments (Zhang et al., 2023, Sen, 2011, Sen, 2012).
Open problems include:
- Extending robust aggregation to fully malicious models with efficient zero-knowledge proofs (Wang et al., 2024, Wen et al., 19 May 2025).
- Reducing bandwidth expansion in high-dimensional models and CKKS-based schemes (Emmaka et al., 28 Nov 2025).
- Dynamic membership support for continuously joining and leaving clients, especially in IoT and edge settings (Wang et al., 2024).
- Seamless integration with advanced DP mechanisms and deployment in resource-constrained or bandwidth-limited environments (Ball et al., 2024, Emmaka et al., 28 Nov 2025).
Recent research demonstrates that by combining advanced cryptographic techniques (homomorphic encryption, secret sharing, TEEs), communication-efficient designs (group-based sharding, one-shot aggregation), and rigorous composable security models, secure aggregation protocols can meet the scalability, efficiency, and robustness demands of modern large-scale distributed learning and privacy-sensitive analytics (Kadhe et al., 2020, Stevens et al., 2022, Zhang et al., 2023).