Secure Aggregation Protocol
- A secure aggregation protocol is a method that enables multiple parties to compute an aggregate, such as a sum, without revealing individual inputs.
- It employs techniques such as pairwise masking, secret sharing, and homomorphic encryption to ensure privacy even under collusion or adversarial threats.
- Protocols optimize dropout tolerance and efficiency, integrating robust methods like verifiable commitments and asynchronous models for federated and sensor networks.
Secure aggregation protocols are cryptographic and statistical mechanisms allowing multiple parties—clients, sensor nodes, or distributed learners—to compute the sum (or more general aggregate) of their private data, while revealing no information about individual inputs to the central aggregator or colluding parties up to defined thresholds. Secure aggregation is widely utilized in privacy-preserving federated learning and distributed sensor networks, addressing both bandwidth and adversarial threat constraints.
1. Security Definitions, Threat Models, and Goals
Secure aggregation protocols are constructed under adversary models that can include honest-but-curious servers, semi-honest or malicious users, collusion between subsets of users and servers, and participant dropouts. The foundational privacy requirement stipulates that no coalition of up to $T$ users (out of a user set of size $N$), possibly including the server, can extract any information about honest users’ inputs beyond the revealed function (for example, the sum). Dropout robustness demands that the final aggregate is exact (or within negligible error) as long as at least $N - D$ users participate, where $D$ is the dropout threshold (Bonawitz et al., 2016, Jahani-Nezhad et al., 2022, Kadhe et al., 2020, Ball et al., 15 Oct 2024).
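The formal privacy guarantee can be stated, in its information-theoretic form, as:

$$I\big(\{x_i\}_{i \notin \mathcal{T}};\ \mathrm{View}_{\mathcal{T}} \,\big|\, f(x_1, \dots, x_N),\ \{x_i\}_{i \in \mathcal{T}}\big) = 0,$$

where $\mathcal{T}$ is the colluding set, $\mathrm{View}_{\mathcal{T}}$ is everything its members observe during the protocol, $x_i$ is the input of user $i$, and $f$ is the specific aggregate function (often sum, but can be more general, e.g., linear weighted sums (Tian et al., 2021)).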
Correctness requires that, as long as at most $D$ users (the dropout threshold) drop out in a round, the server outputs the sum over all inputs actually provided, ignoring the values of missing users.
2. Cryptographic and Statistical Building Blocks
Pairwise Masking and Secret Sharing
The vast majority of high-throughput secure aggregation protocols rely on either pairwise masking using pseudorandom functions (PRFs) seeded via pairwise key agreement (Bonawitz et al., 2016), or Shamir secret sharing (or variants thereof) (Kadhe et al., 2020, Jahani-Nezhad et al., 2022, Beguier et al., 2020). Most implement double masking—random additive masks for each user plus pairwise masks between user pairs—to ensure that any dropped inputs can have their masks reconstructed and canceled, enabling tolerance to dropouts.
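The cancellation of pairwise masks can be illustrated with a minimal sketch. It assumes three users, hard-coded pairwise seeds (a real protocol would derive them through pairwise key agreement), SHA-256 as a stand-in PRF, and arithmetic modulo a hypothetical modulus of 2^32. It omits the per-user self-mask and all dropout handling, so it shows only the pairwise half of double masking.

```python
import hashlib
from itertools import combinations
from typing import Dict, List, Tuple

MODULUS = 2**32  # illustrative choice; all arithmetic is modulo this value


def prf_mask(seed: bytes, length: int) -> List[int]:
    """Expand a shared seed into `length` pseudorandom values in [0, MODULUS)."""
    mask = []
    for counter in range(length):
        digest = hashlib.sha256(seed + counter.to_bytes(4, "big")).digest()
        mask.append(int.from_bytes(digest[:4], "big"))
    return mask


def mask_inputs(
    inputs: Dict[int, List[int]], pair_seeds: Dict[Tuple[int, int], bytes]
) -> Dict[int, List[int]]:
    """Each user adds the mask shared with higher ids and subtracts it for lower ids."""
    masked = {}
    for uid, vec in inputs.items():
        y = list(vec)
        for (i, j), seed in pair_seeds.items():
            if uid not in (i, j):
                continue
            sign = 1 if uid == i else -1  # seeds are keyed as (i, j) with i < j
            mask = prf_mask(seed, len(vec))
            y = [(a + sign * m) % MODULUS for a, m in zip(y, mask)]
        masked[uid] = y
    return masked


def aggregate(masked: Dict[int, List[int]]) -> List[int]:
    """Server-side sum of masked vectors; the pairwise masks cancel."""
    return [sum(column) % MODULUS for column in zip(*masked.values())]


# Hypothetical inputs and seeds, for illustration only.
inputs = {1: [3, 10], 2: [5, 20], 3: [7, 30]}
pair_seeds = {
    (i, j): f"seed-{i}-{j}".encode() for i, j in combinations(sorted(inputs), 2)
}
masked = mask_inputs(inputs, pair_seeds)
total = aggregate(masked)
print(total)  # [15, 60]
```

Each masked vector on its own looks pseudorandom to the server, while the sum over all three users equals the sum of the raw inputs because every pairwise mask is added once and subtracted once.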
Shamir secret sharing is frequently used for robustness and privacy: shares of secrets (random masks or private keys) are spread among users such that any subset of at least $t$ users can reconstruct the secret, but any smaller subset learns nothing. This capability is essential for reconstructing masking secrets when users drop out (Bonawitz et al., 2016, Jahani-Nezhad et al., 2022, Kadhe et al., 2020).
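A minimal sketch of threshold sharing and reconstruction follows. The field (the Mersenne prime 2^61 - 1), the function names, and the example secret are all illustrative assumptions rather than parameters taken from any of the cited protocols.

```python
import secrets

PRIME = 2**61 - 1  # Mersenne prime used as an illustrative field size


def split_secret(secret, threshold, num_shares):
    """Split `secret` into `num_shares` points of a random degree-(threshold-1) polynomial."""
    coeffs = [secret % PRIME] + [secrets.randbelow(PRIME) for _ in range(threshold - 1)]
    shares = []
    for x in range(1, num_shares + 1):
        y = 0
        for c in reversed(coeffs):  # Horner evaluation of the polynomial at x
            y = (y * x + c) % PRIME
        shares.append((x, y))
    return shares


def reconstruct(shares):
    """Recover the secret by Lagrange interpolation at x = 0."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * (-xj)) % PRIME
                den = (den * (xi - xj)) % PRIME
        # Modular inverse via Fermat's little theorem (PRIME is prime).
        secret = (secret + yi * num * pow(den, PRIME - 2, PRIME)) % PRIME
    return secret


# Hypothetical mask seed shared 3-out-of-5.
shares = split_secret(123456789, threshold=3, num_shares=5)
recovered = reconstruct([shares[0], shares[2], shares[4]])
print(recovered == 123456789)
```

In a dropout-tolerant protocol, the shared secret would typically be a mask seed or private key, so that the surviving users can hand the server enough shares to rebuild the masks of a user who disappeared mid-round.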
Homomorphic Encryption
Some protocols use additively homomorphic encryption (e.g., threshold EC-ElGamal (Tian et al., 2021), BFV/RLWE schemes (Ball et al., 15 Oct 2024, Emmaka et al., 28 Nov 2025)) to enable the server to aggregate encrypted values and, with help from the clients, decrypt only the final aggregate. Threshold decryption is used to prevent any single party from decrypting alone.
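The idea of aggregating under encryption can be sketched with a toy Paillier cryptosystem. Paillier is swapped in here purely because it is short to write down; it is not the EC-ElGamal or BFV scheme used by the cited protocols. The sketch also assumes a single key holder instead of threshold decryption, and the key sizes are far too small for real use.

```python
import math
import secrets


def keygen(p=2**31 - 1, q=2**61 - 1):
    """Toy Paillier key generation from two fixed Mersenne primes (insecure sizes)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    mu = pow(lam, -1, n)  # valid because the generator is g = n + 1
    return n, (n, lam, mu)


def encrypt(n, message):
    """Encrypt an integer 0 <= message < n under public key n."""
    n_sq = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1
        if math.gcd(r, n) == 1:
            break
    return ((1 + message * n) % n_sq) * pow(r, n, n_sq) % n_sq


def add_ciphertexts(n, ciphertexts):
    """Multiplying ciphertexts adds the underlying plaintexts."""
    n_sq = n * n
    total = 1  # 1 is a valid encryption of 0
    for c in ciphertexts:
        total = total * c % n_sq
    return total


def decrypt(private_key, ciphertext):
    n, lam, mu = private_key
    u = pow(ciphertext, lam, n * n)
    return ((u - 1) // n) * mu % n


# Hypothetical client values; the server only ever sees ciphertexts.
public_n, private_key = keygen()
ciphertexts = [encrypt(public_n, x) for x in (12, 30, 58)]
encrypted_sum = add_ciphertexts(public_n, ciphertexts)
print(decrypt(private_key, encrypted_sum))  # 100
```

In a threshold variant, the decryption step would be split across several parties so that no single one could run `decrypt` alone.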
Mask Cancellation and Unmasking
Secure aggregation protocols must ensure that masks—whether pairwise, per-user, or polynomial—cancel exactly when summed over all (nondropped) participants, regardless of which users survive the interactive round. This is typically managed by systematic mask arrangement with robust secret sharing and explicit dropout-handling logic (Bonawitz et al., 2016, Jahani-Nezhad et al., 2022).
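The cancellation requirement can be written out for the pairwise-masking case. The notation below is an assumption made for illustration: $\mathcal{U}$ is the set of users that set up masks, $\mathcal{S} \subseteq \mathcal{U}$ the set that actually uploads, $m_{ij}$ the mask shared by users $i < j$, and $R$ the arithmetic modulus.

```latex
% Masked upload of user i
y_i = x_i + \sum_{j \in \mathcal{U},\, j > i} m_{ij}
          - \sum_{j \in \mathcal{U},\, j < i} m_{ji} \pmod{R}

% No dropouts: every m_{ij} appears once with each sign, so
\sum_{i \in \mathcal{U}} y_i = \sum_{i \in \mathcal{U}} x_i \pmod{R}

% With dropouts: only masks shared with dropped users survive
\sum_{i \in \mathcal{S}} y_i
  = \sum_{i \in \mathcal{S}} x_i
  + \sum_{i \in \mathcal{S}} \Bigl(
        \sum_{j \in \mathcal{U} \setminus \mathcal{S},\, j > i} m_{ij}
      - \sum_{j \in \mathcal{U} \setminus \mathcal{S},\, j < i} m_{ji}
    \Bigr) \pmod{R}
```

The residual term depends only on masks shared between a surviving user and a dropped user, which is why the dropout-handling logic focuses on letting the server recover exactly those masks from secret shares.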
Advanced Primitives and Verifiability
Recently, authenticated vector commitments (e.g., Pedersen) and succinct zero-knowledge proofs have been used to ensure verifiability—enabling clients to check that the server did not deviate and that the aggregate reflects the intended computation (Behnia et al., 2023).
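The additive homomorphism that makes Pedersen commitments useful for checking an aggregate can be sketched as follows. The group parameters are toy values chosen for readability, the generators are fixed by hand (in practice nobody may know the discrete log of one generator with respect to the other), and the verification flow is an illustration of the principle, not the construction of any cited scheme.

```python
# Toy group: P = 2*Q + 1 with P and Q prime; G and H generate the order-Q subgroup.
P = 1019
Q = 509
G = 4  # 2^2 mod P, a quadratic residue
H = 9  # 3^2 mod P, a quadratic residue


def commit(value, blinding):
    """Pedersen commitment G^value * H^blinding mod P."""
    return (pow(G, value % Q, P) * pow(H, blinding % Q, P)) % P


def verify_aggregate(commitments, claimed_sum, claimed_blinding):
    """Check that the product of commitments opens to the claimed aggregate."""
    product = 1
    for c in commitments:
        product = (product * c) % P
    return product == commit(claimed_sum, claimed_blinding)


# Hypothetical client values and blinding factors.
values = [5, 17, 42]
blindings = [101, 202, 303]
commitments = [commit(v, r) for v, r in zip(values, blindings)]

honest = verify_aggregate(commitments, sum(values), sum(blindings))
tampered = verify_aggregate(commitments, sum(values) + 1, sum(blindings))
print(honest, tampered)  # True False
```

Because the product of the commitments is itself a commitment to the sum of the values, a server that reports a different aggregate cannot produce a matching opening without breaking the binding property.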
3. Secure Aggregation Protocol Design Paradigms
Centralized Synchronous Protocols
The canonical protocol by Bonawitz et al. (Bonawitz et al., 2016) employs a multi-round design (key exchange, mask sharing, masked upload, and robust unmasking) suitable for high-dimensional federated learning, tolerating a bounded number of user failures per round. FastSecAgg (Kadhe et al., 2020) introduces a three-round protocol leveraging multi-secret sharing via FFT (FastShare), reducing both computation and communication.
Table: Feature Comparison (Selected Protocols)
| Protocol | Privacy Threshold | Dropout Robustness | Client Communication | Server Computation |
|---|---|---|---|---|
| Bonawitz et al. | up to | up to | ||
| SwiftAgg (Jahani-Nezhad et al., 2022) | Arbitrary, | up to | per sum | |
| FastSecAgg (Kadhe et al., 2020) | up to | up to | ||
| BASA (Wang et al., 5 Jun 2024) | (buffer size) | |||
| Hyb-Agg (Emmaka et al., 28 Nov 2025) | | | | |
Sparse Model Support and Communication/Computation Reduction
Protocols such as "Efficient Sparse Secure Aggregation" (Beguier et al., 2020) and "Secure Aggregation Meets Sparsification in Decentralized Learning" (CESAR) (Biswas et al., 13 May 2024) reduce communication and computation by leveraging sparsity in user updates through compression (top-$k$ selection and quantization), while incorporating additional masking/aggregation logic so that privacy is maintained even though users may share updates for disjoint parameter subsets. In CESAR, masking is applied only at the intersection of per-user-sparsified parameter indices, with tunable masking redundancy to resist collusion.
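The compression step and the index-intersection issue can be sketched as below. The function name, the fixed-point scale of 1000, and the example updates are hypothetical; the sketch shows only sparsification, quantization, and the computation of shared indices, not the masking itself.

```python
def sparsify_and_quantize(update, k, scale=1000):
    """Keep the k largest-magnitude entries and map them to fixed-point integers.

    Returns a dict {index: quantized_value}; `scale` is an illustrative choice.
    """
    ranked = sorted(range(len(update)), key=lambda i: abs(update[i]), reverse=True)
    kept = sorted(ranked[:k])
    return {i: round(update[i] * scale) for i in kept}


# Hypothetical model updates from two users.
update_a = [0.1, -2.5, 0.03, 1.7, -0.4]
update_b = [0.0, 3.0, 0.2, -0.1, 0.9]

sparse_a = sparsify_and_quantize(update_a, k=2)
sparse_b = sparsify_and_quantize(update_b, k=2)

# Only coordinates that both users kept can carry a cancelling pairwise mask.
shared_indices = sorted(set(sparse_a) & set(sparse_b))

print(sparse_a)        # {1: -2500, 3: 1700}
print(sparse_b)        # {1: 3000, 4: 900}
print(shared_indices)  # [1]
```

Coordinates kept by only one user have no counterpart to cancel a pairwise mask against, which is the situation that the additional masking logic in sparse protocols has to address.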
Asynchronous and Buffered Models
BASA (Wang et al., 5 Jun 2024) generalizes secure aggregation to the asynchronous federated learning setting by aggregating users in small, fixed-length buffers and using per-buffer attribute-based encryption to mitigate the lack of synchrony. This eliminates the need for synchronous user interactions or trusted hardware, scaling linearly in buffer size rather than total user count.
Stateful Secure Aggregation
Protocols such as "Secure Stateful Aggregation" (Ball et al., 15 Oct 2024) build a continually updated, encrypted aggregation state that supports arbitrary, privacy-preserving linear queries (e.g., for DP-FTRL), overcoming the stateless limitation of classic protocols and enabling streaming aggregation with correlated noise.
One-Shot and Hybrid Protocols
Hyb-Agg (Emmaka et al., 28 Nov 2025) combines multi-key CKKS (MK-CKKS) homomorphic encryption with ECDH-based additive masking to enable constant-size, non-interactive secure aggregation suitable for resource-constrained IoT devices. Each client transmits only a single message per round, and the scheme achieves exact arithmetic sums, IND-CPA security, and resistance to collusion between the server and a bounded number of clients.
4. Secure Aggregation in Wireless Sensor Networks
In WSNs, secure aggregation must address data confidentiality, data integrity, and robustness to node compromise in extremely resource-constrained settings (Sen, 2010, Sen, 2011, 0803.3448, Sen, 2012). Distinct schemes fall into two main categories:
- End-to-end secrecy: Nodes conceal data using symmetric diffusion or homomorphic encryption such that only the base station can recover the aggregate, while intermediate nodes operate solely on encrypted data (0803.3448).
- Distributed estimation approaches: Nodes maintain statistical estimates (mean, covariance) of the aggregate and broadcast only when local evidence deviates by statistically significant margins. Covariance Intersection (CI) fusion is frequently applied to merge estimates (Sen, 2010, Sen, 2011), with anomaly detection via deviation thresholds (e.g., $k\sigma$ rules) and neighbor consensus to isolate or quarantine malicious insiders.
Perfectly secure aggregation, even in the presence of passive eavesdroppers, is approached via combinatorial schemes (e.g., shifted-projections (Fernández-Duque, 2015)), which offer unconditional perfect secrecy at the cost of increased overhead.
5. Protocol Efficiency, Dropout Tolerance, Scalability, and Practicality
- Communication and Computation: Protocols employing compressed/sparsified updates (Beguier et al., 2020) or group-based sharding with local aggregation (Stevens et al., 2022) deliver sublinear per-client communication cost, supporting very large federations of clients with near real-time aggregation.
- Dropout and Asynchrony: Techniques combining per-user masking, Shamir sharing, and buffer or cohort-based partitioning enable robust operation in cross-device/IoT scenarios with high dropout or unreliable connectivity (Wang et al., 5 Jun 2024, Ball et al., 15 Oct 2024).
- Performance: Modern protocols achieve wall clock times per round from 20 ms/user (e.g., e-SeaFL (Behnia et al., 2023)) to <1 second on resource-constrained hardware (Emmaka et al., 28 Nov 2025).
- Verifiability: Recent schemes integrate authenticated vector commitments or succinct proof aggregation, allowing clients to verify the correctness of the server’s aggregation, even in the fully malicious server model (Behnia et al., 2023, Wen et al., 19 May 2025).
- Hybrid Crypto/TEE Approaches: Deployments may combine cryptography (FHE, OT) with hardware-based trusted execution environments (TEEs), enabling a spectrum of trade-offs between security cost and aggregation latency (Laage et al., 11 Apr 2025).
6. Extensions and Open Problems
- Beyond Sum Aggregation: Protocols extend to weighted sums, linear functions, or any computation supported by the underlying additive homomorphic encryption or secret sharing schemes (Tian et al., 2021).
- Malicious Adversaries: Some protocols provide privacy and correctness in the malicious setting, resisting adversarial participation in both user and server roles (Beguier et al., 2020, Stevens et al., 2022).
- Heterogeneous Models and Federated Distillation: Recent protocols address aggregation for model-heterogeneous settings (distillation), using co-aggregation via multi-point coding and verifiable aggregation with bilinear pairings to defend against both privacy and poisoning attacks (Wen et al., 19 May 2025).
- Tradeoffs: The main frontier remains the balance between communication/computation (especially under high sparsity and asynchrony), privacy threshold and collusion-resilience, and verifiable integrity, with active/faulty adversaries and asynchronicity introducing additional complexity.
7. Representative Protocols and Their Application Domains
| Protocol | Core Security Method | Application Domain | Notable Features | Reference |
|---|---|---|---|---|
| Bonawitz et al. | Pairwise/Dual Masking + Shamir Sharing | Federated Learning | Dropout-tolerance, comm, up to fault tolerance | (Bonawitz et al., 2016) |
| FastSecAgg | FFT-Based Multi-Secret Sharing | Cross-device FL | comp., adaptive corruptions, avg-case dropouts | (Kadhe et al., 2020) |
| Efficient Sparse SA | Additive Sharing + Sparse Compression | Cross-silo FL | Top-$k$ sparse compression, comm., malicious privacy | (Beguier et al., 2020) |
| SwiftAgg | Grouped Shamir, Two-Phase Structure | Mobile Federated Learning | Near-linear comm, worst-case security, -collusion | (Jahani-Nezhad et al., 2022) |
| BASA | CP-ABE Masking, Buffer | Asynchronous FL | Purely async, comm, up to collusion | (Wang et al., 5 Jun 2024) |
| Hyb-Agg | MK-CKKS + ECDH Masking | IoT FL, Edge Deployments | Single-round, comm, no decryption shares | (Emmaka et al., 28 Nov 2025) |
| Secure Stateful Agg. | RLWE Key-Homomorphic Encryption | DP-FTRL Federated Learning | Streaming state, correlated noise, arbitrary lin. queries | (Ball et al., 15 Oct 2024) |
| CESAR | Pairwise Masking on Sparse Intersections | Decentralized Learning | Untrusted/serverless, collusion-resilient, comm/acc. tradeoff | (Biswas et al., 13 May 2024) |
| Shard-SS (Stevens et al., 2022) | Hierarchical/Group-based Shamir Sharing | Very-large-scale FL | per-client comm, client scalability | (Stevens et al., 2022) |
| e-SeaFL | Masked Hashing, Verifiable Commitments | Cross-silo FL, Malicious Server | Succinct integrity proof, 1-round semi-honest/malicious | (Behnia et al., 2023) |
| SVAFD | Lagrange-coded co-aggregation, Proofs | Fed. Distillation/Heterogeneous FL | Poisoning defense, explicit verification, logit aggregation | (Wen et al., 19 May 2025) |
Secure aggregation protocols remain a core primitive for privacy-preserving data analytics in both federated and decentralized machine learning. Ongoing advancements address asynchrony, throughput/latency minimization, adversarial robustness, and the generalization to non-standard aggregation tasks.