Packed Shamir Secret Sharing (PSS)
- Packed Shamir Secret Sharing is a method that encodes a vector of secrets into a single polynomial, enabling parallel sharing while preserving threshold privacy.
- It optimizes secure multi-party computation by reducing communication overhead and increasing throughput, notably for deep neural network inference.
- The scheme employs VM-RandTuple structures and filter-packing techniques to support efficient vector–matrix multiplication and convolution while maintaining $t$-privacy.
Packed Shamir Secret Sharing (PSS) is a generalization of Shamir's $(t, n)$-threshold secret sharing scheme, enabling the encoding of a vector of $k$ secrets into a single polynomial of degree $t + k - 1$ over a finite field. This structure permits parallel, or "packed," computation over multiple secret values with communication and round complexity closely matching those for sharing a single value. PSS is particularly designed to enhance throughput and scalability in secure multi-party computation (MPC), especially for deep neural network inference in honest-majority settings, by reducing the otherwise prohibitive communication overhead and enabling high degrees of parallelism (Zhang et al., 19 Jan 2026).
1. Formal Definition and Construction
PSS operates over the field $\mathbb{F}_p$, where $p$ is a Mersenne prime (e.g., $p = 2^{61} - 1$), optimizing arithmetic efficiency. For $n$ servers and packing factor $k$, the privacy threshold is $t$, so that share polynomials have degree $t + k - 1$. A vector of secrets $\mathbf{s} = (s_0, \dots, s_{k-1})$ is packed as the low-order coefficients of a degree-$(t + k - 1)$ polynomial:

$$f(x) = \sum_{j=0}^{k-1} s_j x^j + \sum_{j=k}^{t+k-1} r_j x^j,$$

where the $r_j$ are sampled uniformly at random from $\mathbb{F}_p$.
Each server $i$ is assigned a publicly known, pairwise-distinct evaluation point $\alpha_i \in \mathbb{F}_p$ and receives the share $f(\alpha_i)$. Any $t + k$ shares suffice to reconstruct the entire vector by Lagrange interpolation, whereas any $t$ or fewer reveal nothing. Extraction of the $j$-th secret utilizes the Lagrange coefficients:

$$s_j = \sum_{i \in S} \lambda_{i,j} \, f(\alpha_i), \qquad |S| = t + k,$$

where the coefficients $\lambda_{i,j}$ are determined by the interpolation basis at the corresponding evaluation points. This construction provides parallel privacy and reconstruction for the $k$ secrets at the cost of a single polynomial evaluation per server, achieving a $k$-fold packing efficiency while maintaining threshold security properties (Zhang et al., 19 Jan 2026).
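The construction above can be sketched in a few lines of plaintext Python. The prime, evaluation points, and parameter choices below are illustrative (a small Mersenne prime and $n = 7$, $t = 2$, $k = 3$), not the paper's concrete instantiation; the real protocol never reassembles all shares in one place.

```python
# Coefficient-packed Shamir sharing over F_p: share k secrets, reconstruct
# from any t+k shares by Lagrange interpolation in coefficient form.
import random

P = 2**31 - 1  # a Mersenne prime; illustrative choice

def poly_mul(a, b):
    """Multiply two coefficient vectors mod P."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] = (out[i + j] + ai * bj) % P
    return out

def share(secrets, n, t):
    """Pack k secrets as low-order coefficients of a degree-(t+k-1)
    polynomial; server i receives the evaluation f(i)."""
    coeffs = list(secrets) + [random.randrange(P) for _ in range(t)]
    points = list(range(1, n + 1))  # public, pairwise-distinct points
    shares = [sum(c * pow(x, j, P) for j, c in enumerate(coeffs)) % P
              for x in points]
    return points, shares

def reconstruct(points, shares, k, t):
    """Recover all k packed secrets from any t+k shares: interpolate the
    polynomial in coefficient form, then read off the first k coefficients."""
    m = t + k
    xs, ys = points[:m], shares[:m]
    total = [0] * m
    for i in range(m):
        num, denom = [1], 1  # build Lagrange basis polynomial L_i(x)
        for j in range(m):
            if j == i:
                continue
            num = poly_mul(num, [(-xs[j]) % P, 1])
            denom = denom * (xs[i] - xs[j]) % P
        scale = ys[i] * pow(denom, P - 2, P) % P  # Fermat inverse of denom
        for d, c in enumerate(num):
            total[d] = (total[d] + scale * c) % P
    return total[:k]

points, shares = share([5, 17, 42], n=7, t=2)    # k=3, degree t+k-1 = 4
assert reconstruct(points, shares, k=3, t=2) == [5, 17, 42]
```

Note that any subset of $t + k = 5$ of the 7 shares works equally well, e.g. `reconstruct(points[2:], shares[2:], 3, 2)` returns the same vector.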
2. VM-RandTuple Structures for Vector–Matrix Multiplication
Efficient secure vector–matrix multiplication in the MPC context leverages Vector–Matrix-Multiplication-Friendly Random Share Tuples (VM-RandTuples). A VM-RandTuple is a pair of correlated packed sharings: the first component is a packed random vector in $\mathbb{F}_p^k$, and the second produces a packed sum for each output coordinate of the product. The protocol generates these tuples offline using a two-round Vandermonde-matrix method, with each server secret-sharing random values and exchanging linear combinations defined by transposed Vandermonde matrices. Privacy against up to $t$ colluding servers is preserved, as each PSS instance ensures information-theoretic secrecy for a pack of $k$ secrets.
This structure allows vector–matrix or matrix–matrix products to be performed in parallel across all $k$ packed values per lane. VM-RandTuple generation incurs per-server field-element costs in both the offline and online phases (Zhang et al., 19 Jan 2026).
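The Vandermonde extraction step described above can be illustrated in plaintext: each of $n$ dealers contributes one random value, and multiplying the length-$n$ contribution vector by an $(n-t) \times n$ transposed Vandermonde matrix yields $n - t$ outputs that remain uniform even if $t$ dealers are corrupt. The matrix shape and indexing here are assumptions for the sketch; the paper applies this to packed sharings rather than raw field elements.

```python
# Vandermonde-based randomness extraction (plaintext sketch): combine n
# dealer contributions into n-t outputs via a transposed Vandermonde matrix.
P = 2**31 - 1  # illustrative Mersenne prime

def vandermonde(rows, cols):
    """V[i][j] = (i+1)^j mod P: rows of geometric progressions."""
    return [[pow(i + 1, j, P) for j in range(cols)] for i in range(rows)]

def extract_randomness(contributions, t):
    """contributions: n field elements, one per dealer; returns n-t
    combined values, each a full-rank linear mix of all contributions."""
    n = len(contributions)
    V = vandermonde(n - t, n)  # (n-t) x n
    return [sum(V[r][i] * contributions[i] for i in range(n)) % P
            for r in range(n - t)]

# With n=4 dealers and t=2, two extracted values survive:
print(extract_randomness([1, 1, 1, 1], t=2))  # -> [4, 15]
```

Because every $(n-t) \times (n-t)$ submatrix of a Vandermonde matrix is invertible, each output depends nontrivially on the honest dealers' randomness, which is the property the two-round offline generation relies on.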
3. Filter Packing for Parallel Secure Convolution
PSS enables efficient packing of filters for secure convolutional neural network evaluation by grouping filters into a single PSS value for each spatial weight position. For a convolutional layer with $k$ filters (all of the same shape), packing is performed so that each spatial position $(u, v)$ has the vector of the $k$ filters' weights at that position mapped into a single degree-$(t+k-1)$ polynomial. Input tensors are similarly packed. Convolution then reduces to a packed inner product followed by an add-and-truncate operation, enabling simultaneous processing across all channels with negligible overhead over a single-channel computation. Padding is handled efficiently by packing zeros, incurring no extra communication. The offline phase takes 4 rounds and the online phase one round, with per-server communication (in field elements) determined by the post-unfolding matrix shape (Zhang et al., 19 Jan 2026).
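The filter-packing layout can be viewed in plaintext as follows: each spatial weight position holds one length-$k$ vector (one PSS pack), and every output pixel for all $k$ filters is accumulated lane-wise from those packs. Shapes and names below are assumptions for the sketch, not the paper's exact data layout, and the secure version operates on shares with truncation after the inner product.

```python
# Filter packing (plaintext view): group the k filters' weights at each
# spatial position into one vector, so convolution becomes per-position
# scalar-times-pack accumulation across all k output channels at once.

def pack_filters(filters):
    """filters: list of k (h x w) weight grids -> {(u, v): length-k vector}."""
    k, h, w = len(filters), len(filters[0]), len(filters[0][0])
    return {(u, v): [filters[f][u][v] for f in range(k)]
            for u in range(h) for v in range(w)}

def packed_conv(inp, packed, h, w, k):
    """Valid (no-padding) convolution of a single-channel input against all
    k packed filters simultaneously; returns k output feature maps."""
    H, W = len(inp), len(inp[0])
    out = [[[0] * (W - w + 1) for _ in range(H - h + 1)] for _ in range(k)]
    for y in range(H - h + 1):
        for x in range(W - w + 1):
            for (u, v), vec in packed.items():
                pix = inp[y + u][x + v]  # one public-position input value
                for f in range(k):       # one lane per filter
                    out[f][y][x] += vec[f] * pix
    return out

# Filter 0 selects the window's top-left pixel, filter 1 its bottom-right:
filters = [[[1, 0], [0, 0]], [[0, 0], [0, 1]]]
inp = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
out = packed_conv(inp, pack_filters(filters), h=2, w=2, k=2)
assert out[0] == [[1, 2], [4, 5]] and out[1] == [[5, 6], [8, 9]]
```

In the secure protocol, the inner loop over `f` is free: one packed multiplication touches all $k$ lanes, which is where the channel-parallel speedup comes from.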
4. Efficient Parallel Non-Linear Operations
All Boolean and bitwise operations can be performed in parallel across all $k$ packed values within a single PSS instance. For example, prefix-OR (critical to comparisons) is computed via a binary tree of DN-style multiplications in $O(\log \ell)$ rounds for $\ell$-bit values. Bitwise less-than is implemented by bit decomposition, XOR, and prefix-OR, with all $k$ comparisons done simultaneously inside one PSS instance. For DReLU/ReLU, the protocol masks $2x$ with a random value, evaluates one bitwise less-than, and corrects the result; DReLU completes once the bitwise less-than finishes, and ReLU requires one additional multiplication round. Maxpool combines ReLU and pairwise comparisons, all parallelized across lanes. Every invocation of a DN-style multiplication or degree transformation maintains the $t$-privacy inherent to PSS (Zhang et al., 19 Jan 2026).
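The log-depth prefix-OR can be sketched in plaintext. Since OR of two bits is $a + b - ab$, each level of the circuit costs one round of secure multiplications; the scan below (a Hillis–Steele-style realization, which may differ structurally from the paper's binary tree but has the same logarithmic round count) computes all $\ell$ prefix-ORs in $\lceil \log_2 \ell \rceil$ levels.

```python
# Log-depth prefix-OR over bits: OR(a, b) = a + b - a*b, so each level of
# the scan corresponds to one round of DN-style multiplications; all lanes
# within a level run in parallel.

def or_gate(a, b):
    return a + b - a * b  # one secure multiplication in the MPC setting

def prefix_or(bits):
    """Parallel scan: after ceil(log2(l)) levels, out[i] = OR(bits[0..i])."""
    out = list(bits)
    shift = 1
    while shift < len(out):
        # Read the previous level's values, write a fresh level.
        out = [or_gate(out[i], out[i - shift]) if i >= shift else out[i]
               for i in range(len(out))]
        shift *= 2
    return out

assert prefix_or([0, 0, 1, 0, 1]) == [0, 0, 1, 1, 1]
assert prefix_or([1, 0, 0, 0]) == [1, 1, 1, 1]
```

Bitwise less-than then follows the usual pattern: XOR the two bit strings, prefix-OR the XORs from the most significant bit down, and select the first differing bit position.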
5. Performance Metrics and Empirical Scalability
For packing factor $k$, PSS realizes significant reductions in both communication and computational overhead across secure inference protocols.
| Operation | Offline rounds/comm. | Online rounds/comm. |
|---|---|---|
| Vector–matrix multiplication | 2 rounds | 1 round |
| Convolution | 4 rounds | 1 round |
| ReLU | online rounds | communication per lane |
Empirical evaluations (11–63 servers, fixed-point precision of 13 bits) indicate communication reductions, compared to the Shamir-only schemes of Liu et al. (USENIX'24), in offline, online, and total cost, along with corresponding wall-clock speedups on wide-area networks. These improvements, especially for deeper architectures like VGG16 run with up to 63 servers, are due to the parallelization enabled by PSS. In local-area networks, where computation dominates, offline and total speedups are likewise observed on deep networks (Zhang et al., 19 Jan 2026).
6. Cryptographic and Practical Implications
PSS maintains the $t$-threshold privacy and reconstruction guarantees per pack of $k$ values, with each operation, linear or nonlinear, executed in parallel over all lanes. This property enables throughput and scalability increases by roughly a factor of $k$ for both linear layers (matrix operations) and elementwise functions, with little connectivity or round overhead. The ability to parallelize across many secrets positions PSS as a practical primitive for secure, high-throughput computation in multi-party inference and cryptographic ML, overcoming the severe scalability and latency limitations of classical Shamir-based MPC protocols in network-constrained environments (Zhang et al., 19 Jan 2026).