Multivariate Polynomial Coding Schemes

Updated 20 January 2026

Multivariate polynomial coding schemes are defined by evaluating m-variate polynomials over finite fields, enabling robust error correction and tunable locality.
They generalize univariate Reed–Solomon and Reed–Muller codes by incorporating advanced lifting and multiplicity techniques to optimize rate, distance, and decoding complexity.
These schemes underpin applications in distributed storage, coded computing, and secure computation by balancing trade-offs among rate, locality, and computational efficiency.

Multivariate polynomial coding schemes encode information into codewords defined via the evaluation or symbolic manipulation of multivariate polynomials over a finite field, supporting a variety of advanced requirements including robust error correction, locality, privacy, batch retrieval, and efficient distributed computation. These schemes generalize the theory and techniques of univariate Reed–Solomon and Reed–Muller codes to higher-dimensional algebraic settings, enabling new trade-offs between rate, locality, recoverability, and computational complexity.

1. Fundamental Constructions and Definitions

Multivariate polynomial codes are built on the principle of evaluating $m$-variate polynomials $f \in \mathbb{F}_q[X_1, \ldots, X_m]$ at structured or arbitrary collections of points in $\mathbb{F}_q^m$. The classical form is the Reed–Muller code: $\mathrm{RM}_q(m,d) = \{ (f(\mathbf{a}))_{\mathbf{a}\in A} : f \in \mathbb{F}_q[X_1,\ldots,X_m],~\deg f \leq d \}$ with $A \subseteq \mathbb{F}_q^m$, usually $A = \mathbb{F}_q^m$ itself. For $d < q$ and $A = \mathbb{F}_q^m$, the dimension is $\binom{m+d}{d}$ and, by the Schwartz–Zippel lemma, the relative minimum distance is $1 - d/q$.
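As a minimal sketch of the evaluation map, the following Python enumerates the monomials of total degree at most $d$ and evaluates a polynomial at every point of $\mathbb{F}_p^m$ for a small prime $p$. The function names and the parameters $p = 5$, $m = d = 2$ are illustrative assumptions, and the sketch assumes $d < p$.

```python
from itertools import product
from math import comb

def monomials(m, d):
    """All exponent vectors (e_1, ..., e_m) with e_1 + ... + e_m <= d."""
    return [e for e in product(range(d + 1), repeat=m) if sum(e) <= d]

def rm_encode(coeffs, m, p):
    """Evaluate f = sum_e coeffs[e] * X^e at every point of F_p^m (p prime)."""
    codeword = []
    for a in product(range(p), repeat=m):
        value = 0
        for e, c in coeffs.items():
            term = c
            for a_i, e_i in zip(a, e):
                term = term * pow(a_i, e_i, p) % p
            value = (value + term) % p
        codeword.append(value)
    return codeword

p, m, d = 5, 2, 2
basis = monomials(m, d)  # a basis of the message space when d < p

# Encode f(X1, X2) = 1 + 2*X1*X2 + 3*X2^2 over F_5.
f = {(0, 0): 1, (1, 1): 2, (0, 2): 3}
codeword = rm_encode(f, m, p)
weight = sum(1 for v in codeword if v != 0)
```

Here `len(basis)` equals $\binom{m+d}{d}$, and the Schwartz–Zippel bound guarantees that `weight` is at least $p^m (1 - d/p)$ for any nonzero message.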

Advanced families, such as lifted Reed–Solomon, lifted multiplicity, quadratic-lifted, and high-rate evaluation codes, are defined via constraints on the behavior of $f$ restricted to affine lines or curves. The lifted Reed–Solomon code, for instance, consists of all $f \in \mathbb{F}_q[X_1, \ldots, X_m]$ whose restriction to every affine line $L \subseteq \mathbb{F}_q^m$ is a univariate polynomial of degree at most $d$. The lifted multiplicity code further requires agreement of all Hasse derivatives up to a fixed order on such lines, supporting systematic encoding of both values and derivatives (Holzbaur et al., 2021).
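The lifting condition can be checked by brute force on a tiny example. The sketch below works over $\mathbb{F}_4$ with degree bound 2 and tests whether a bivariate monomial $X^i Y^j$ restricts to a function of degree at most 2 on every affine line. It relies on the fact that a function on $\mathbb{F}_4$ has degree at most 2 exactly when its values sum to zero. The integer encoding of $\mathbb{F}_4$ (addition as XOR, a hard-coded multiplication table) and all function names are assumptions made for illustration.

```python
from itertools import product

# F_4 = {0, 1, w, w^2} encoded as 0, 1, 2, 3; addition is XOR.
MUL = [[0, 0, 0, 0],
       [0, 1, 2, 3],
       [0, 2, 3, 1],
       [0, 3, 1, 2]]

def gf4_pow(x, e):
    """x^e in F_4 by repeated multiplication (x^0 = 1)."""
    r = 1
    for _ in range(e):
        r = MUL[r][x]
    return r

def restricts_to_low_degree(i, j):
    """True if X^i Y^j has degree <= 2 on every affine line of F_4^2.

    A line is t -> (a + b*t, c + d*t) with (b, d) != (0, 0). The restriction
    has degree <= 2 iff its four values sum (XOR) to zero.
    """
    for a, c, b, d in product(range(4), repeat=4):
        if (b, d) == (0, 0):
            continue
        total = 0
        for t in range(4):
            x = a ^ MUL[b][t]
            y = c ^ MUL[d][t]
            total ^= MUL[gf4_pow(x, i)][gf4_pow(y, j)]
        if total != 0:
            return False
    return True
```

In this setting `restricts_to_low_degree(2, 2)` holds even though $X^2 Y^2$ has total degree 4, so the lifted code is strictly larger than the Reed–Muller code of degree 2. By contrast, $X^3$ and $X^2 Y$ fail the test.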

Multiplicity codes generalize pointwise evaluation by also recording derivatives at each point (classically Hasse derivatives, or divided differences for arbitrary characteristic), extending the code’s redundancy and error-correction capability (Bhandari et al., 2020, Venkitesh, 2024). These extensions enable local decoding, batch reconstruction, and high-rate codes unattainable by traditional Reed–Muller codes.
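To make the notion of a multiplicity symbol concrete, the following sketch computes Hasse derivatives of a bivariate polynomial over a prime field and collects all derivatives of order less than $s$ at one point. The dictionary representation of polynomials and the function names are assumptions for illustration.

```python
from math import comb

def hasse_derivative(poly, order, p):
    """Hasse derivative of a bivariate polynomial over F_p (p prime).

    poly maps exponent pairs (i, j) to coefficients; order = (a, b).
    The (a, b)-th Hasse derivative of X^i Y^j is
    C(i, a) * C(j, b) * X^(i-a) * Y^(j-b).
    """
    a, b = order
    out = {}
    for (i, j), c in poly.items():
        if i >= a and j >= b:
            coeff = c * comb(i, a) * comb(j, b) % p
            if coeff:
                out[(i - a, j - b)] = coeff
    return out

def evaluate(poly, point, p):
    x, y = point
    return sum(c * pow(x, i, p) * pow(y, j, p) for (i, j), c in poly.items()) % p

def multiplicity_symbol(poly, point, s, p):
    """All Hasse derivatives of total order < s at one point."""
    return {(a, b): evaluate(hasse_derivative(poly, (a, b), p), point, p)
            for a in range(s) for b in range(s - a)}

# f(X, Y) = X^2 * Y + 3 * X over F_7, multiplicity order s = 2, at (2, 3).
f = {(2, 1): 1, (1, 0): 3}
symbol = multiplicity_symbol(f, (2, 3), 2, 7)
```

Each codeword position then carries $\binom{s+1}{2}$ field elements in the bivariate case (three for $s = 2$) instead of one.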

2. Code Parameters: Rate, Distance, and Locality

The rate, minimum distance, and locality of multivariate polynomial codes are tied to the algebraic and combinatorial structure of the underlying evaluation set and constraints.

Rate:

For lifted Reed–Solomon and multiplicity codes, the dimension is computed by counting “good” (admissible) monomials, those whose restriction to every line remains in the prescribed univariate code. The asymptotic growth of this count is governed by a constant associated with a specific matrix; the explicit expression is given in (Holzbaur et al., 2021).

Distance:

For lifted multiplicity codes, the minimum distance admits an explicit lower bound that depends on the multiplicity order (Holzbaur et al., 2021).

Locality and Availability:

Codes constructed using restrictions to low-degree curves (e.g., rigid quadratic polynomials) offer locality, measured here as the number of erasures that can be corrected by decoding within a recovery set, together with exponentially many disjoint recovery sets per symbol and explicit bounds on the dimension and distance (Liu, 7 Jan 2025).

High-rate constructions utilizing specially designed simplicial or algebraic evaluation sets achieve rates $1 - \varepsilon$ for fixed $m$, with relative distance bounded below by a positive constant, overcoming the $1/m!$ rate barrier of product-set Reed–Muller codes (Kopparty et al., 2024).
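The rate limitation of Reed–Muller codes on the full grid follows from a direct count: for $d < q$ the rate is $\binom{m+d}{m} / q^m$, which is roughly $(d/q)^m / m!$ for large $q$. The short sketch below tabulates this for a few values of $m$ at relative distance about $1/2$. The function names and the choice $q = 257$ are illustrative assumptions.

```python
from math import comb, factorial

def rm_rate(q, m, d):
    """Rate of RM_q(m, d) evaluated on all of F_q^m, valid for d < q."""
    assert d < q
    return comb(m + d, m) / q ** m

def rm_relative_distance(q, d):
    """Schwartz-Zippel relative distance 1 - d/q, valid for d < q."""
    assert d < q
    return 1 - d / q

q = 257
d = q // 2  # relative distance about 1/2
table = [(m, rm_rate(q, m, d), 1 / factorial(m)) for m in (1, 2, 3, 4)]
for m, rate, barrier in table:
    print(f"m={m}  rate={rate:.4f}  1/m!={barrier:.4f}")
```

The rate falls quickly as $m$ grows while the distance stays fixed, which is the behavior the high-rate constructions are designed to avoid.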

3. Decoding and Self-Correction Algorithms

Multivariate polynomial coding schemes often admit efficient unique or list decoding algorithms that exploit the algebraic structure of the codes:

List Decoding:

Multiplicity codes on general grids $S^m$ can be list-decoded up to their (Schwartz–Zippel) distance $1 - \frac{d}{s|S|}$, where $d$ is the degree and $s$ the multiplicity order, even beyond the Johnson bound. The approach applies the multivariate polynomial method together with a structured “gluing” of derivatives; the solution space of polynomials consistent with the received word has polynomial or constant size once the remaining code parameters are fixed (Bhandari et al., 2020). For divided-difference multiplicity codes, list decoding is feasible up to the distance even in small characteristic, circumventing obstructions found in the Hasse-derivative setting (Venkitesh, 2024).

Local Self-Correction:

Lifted multiplicity codes enable sublinear local self-correction: to correct the values and derivatives at a point, one samples sets of directions, decodes the function restricted to each line using univariate multiplicity decoding, and solves a linear system to reconstruct the value and all derivatives at the target point. The query complexity is sublinear in the block length (Holzbaur et al., 2021).
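The line-based idea can be illustrated in a much simpler setting than the one described above. The sketch below corrects a single symbol of a plain bivariate Reed–Muller codeword (multiplicity one, no derivatives) over a prime field. It interpolates the restriction to each line through the target point by Lagrange interpolation and takes a majority vote, rather than running univariate multiplicity decoding and solving a linear system. For reproducibility it enumerates all directions instead of sampling them. The field size, the polynomial, and all names are assumptions.

```python
from collections import Counter

P = 13  # prime field size (illustrative)

def evaluate(poly, x, y):
    """Evaluate a bivariate polynomial {(i, j): coeff} over F_P."""
    return sum(c * pow(x, i, P) * pow(y, j, P) for (i, j), c in poly.items()) % P

def interpolate_at_zero(ts, vals):
    """Lagrange-interpolate g from g(ts[k]) = vals[k] and return g(0)."""
    total = 0
    for k, (tk, vk) in enumerate(zip(ts, vals)):
        num, den = 1, 1
        for l, tl in enumerate(ts):
            if l != k:
                num = num * (-tl) % P
                den = den * (tk - tl) % P
        total = (total + vk * num * pow(den, -1, P)) % P
    return total

def all_directions():
    """One representative per line through a point of F_P^2 (P + 1 lines)."""
    return [(1, b) for b in range(P)] + [(0, 1)]

def local_correct(word, point, d, directions):
    """Majority vote over lines through `point`; each line reads d + 1 symbols."""
    ts = list(range(1, d + 2))  # nonzero parameters: `point` itself is never read
    votes = Counter()
    for dx, dy in directions:
        vals = [word[((point[0] + t * dx) % P, (point[1] + t * dy) % P)] for t in ts]
        votes[interpolate_at_zero(ts, vals)] += 1
    return votes.most_common(1)[0][0]

# Codeword of a degree-3 polynomial, then five corrupted positions.
poly = {(3, 0): 2, (1, 1): 5, (0, 2): 1, (0, 0): 7}
word = {(x, y): evaluate(poly, x, y) for x in range(P) for y in range(P)}
target = (4, 9)
for pt in [target, (0, 0), (1, 2), (7, 7), (12, 3)]:
    word[pt] = (word[pt] + 1) % P

corrected = local_correct(word, target, 3, all_directions())
```

Each corrupted position other than the target lies on exactly one line through the target, so at most four of the fourteen votes can be wrong here and the majority recovers the original symbol. A sampled subset of directions reduces the number of queries at the cost of a probabilistic guarantee.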

Batch and PIR Code Construction:

Explicit constructions show that lifted Reed–Solomon codes support batch queries, with each query served from its own disjoint recovery set, at an explicitly bounded redundancy for suitable parameter choices, and multiplicity codes yield PIR codes with specific redundancy and block-alphabet sizes (Holzbaur et al., 2021).

4. Structural Properties: MDS, GM-MDS, and Higher-Order Capabilities

Polynomial codes have rich structural properties connecting maximum distance separability and higher-order MDS (GM-MDS) characteristics:

GM-MDS Characterization:

The generalized GM-MDS theorem holds for all polynomial codes: for any zero-pattern subject to Hall-type constraints, there exists a choice of evaluation points ensuring every $k \times k$ minor of the generator matrix is nonzero, where $k$ is the code dimension; that is, all such codes attain all generic zero-patterns. This also applies to their duals, even when the dual code is not a polynomial code (Brakensiek et al., 2023).
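The minor condition is easy to test exhaustively for small codes. The sketch below checks only the plain MDS property (every $k \times k$ minor of a $k \times n$ generator matrix is nonzero) for a Reed–Solomon generator matrix over $\mathbb{F}_7$. It does not construct generator matrices with prescribed zero patterns, which is what the GM-MDS statement concerns. Function names and parameters are assumptions.

```python
from itertools import combinations

def det_mod_p(matrix, p):
    """Determinant over F_p by Gaussian elimination (p prime)."""
    a = [row[:] for row in matrix]
    n = len(a)
    det = 1
    for col in range(n):
        pivot = next((r for r in range(col, n) if a[r][col] % p), None)
        if pivot is None:
            return 0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det  # a row swap flips the sign
        det = det * a[col][col] % p
        inv = pow(a[col][col], -1, p)
        for r in range(col + 1, n):
            factor = a[r][col] * inv % p
            for c in range(col, n):
                a[r][c] = (a[r][c] - factor * a[col][c]) % p
    return det % p

def is_mds(generator, p):
    """True if every k x k minor of the k x n generator matrix is nonzero."""
    k, n = len(generator), len(generator[0])
    return all(
        det_mod_p([[generator[i][j] for j in cols] for i in range(k)], p) != 0
        for cols in combinations(range(n), k)
    )

p, k = 7, 3
reed_solomon = [[pow(x, i, p) for x in [0, 1, 2, 3, 4, 5]] for i in range(k)]
degenerate = [[pow(x, i, p) for x in [0, 1, 1, 2]] for i in range(k)]  # repeated point
```

With distinct evaluation points every minor is a Vandermonde determinant and the check succeeds; repeating a point produces two identical columns and the check fails.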

Higher-Order MDS (MDS($\ell$)) and List Decoding:

Multivariate polynomial codes and their duals are MDS($\ell$) for all $\ell$, establishing that both they and randomly-punctured algebraic-geometric codes achieve list-decoding capacity at optimal rate and constant output list size over appropriately large fields. The dual MDS($\ell$) property implies optimal list-decodability via the Singleton bound (Brakensiek et al., 2023).

These properties extend to evaluation codes on irreducible varieties, not just affine points, supporting generalizations fundamental for modern coding theory.

5. Applications in Distributed Storage, Secure Computation, and Matrix Computation

Multivariate polynomial coding schemes are foundational in multiple modern applications:

Distributed Storage:

Codes based on multivariate polynomials (notably Reed–Muller, lifted, and quadratic-lifted codes) enable efficient exact-repair schemes: a failed node is recovered by suitable combinations (traces) of codeword symbols from surviving nodes, exploiting dual code and polynomial trace properties, with explicit bandwidth and sub-packetization formulas (López et al., 31 Dec 2025, Liu, 7 Jan 2025).

Coded Computing:

Multivariate schemes are used for efficient matrix-matrix and matrix-chain multiplications in distributed settings, enabling reduced per-worker storage and communication overhead compared to naive univariate lifts. Parameterizations allow tuning the trade-off between computation and communication overheads, and support arbitrary matrix partitions and recovery thresholds (Gómez-Vilardebó, 13 Jan 2026, Gómez-Vilardebó et al., 2024).
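A simplified bivariate example conveys the basic mechanism; it is not the scheme of the cited papers. Here $A$ is split into $K$ row blocks and $B$ into $L$ column blocks, each worker receives one evaluation of $\tilde{A}(x) = \sum_i A_i x^i$ and one of $\tilde{B}(y) = \sum_j B_j y^j$, and the master interpolates $\tilde{A}(x)\tilde{B}(y) = \sum_{i,j} A_i B_j x^i y^j$ from any $KL$ results whose interpolation matrix is invertible. The prime modulus, the evaluation points, and all function names are assumptions.

```python
P = 10007  # prime modulus (illustrative); all arithmetic is over F_P

def mat_mul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) % P for col in zip(*b)] for row in a]

def mat_comb(mats, weights):
    """Entrywise sum_k weights[k] * mats[k] over F_P."""
    rows, cols = len(mats[0]), len(mats[0][0])
    return [[sum(w * m[r][c] for w, m in zip(weights, mats)) % P
             for c in range(cols)] for r in range(rows)]

def solve_mod_p(system, rhs):
    """Solve system * z = rhs over F_P by Gauss-Jordan elimination.

    `rhs` holds one matrix per equation; the result holds one matrix per unknown.
    """
    n = len(system)
    a = [row[:] for row in system]
    b = list(rhs)
    for col in range(n):
        pivot = next((r for r in range(col, n) if a[r][col] % P), None)
        if pivot is None:
            raise ValueError("evaluation points do not determine the product")
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        inv = pow(a[col][col], -1, P)
        a[col] = [v * inv % P for v in a[col]]
        b[col] = mat_comb([b[col]], [inv])
        for r in range(n):
            if r != col and a[r][col] % P:
                f = a[r][col]
                a[r] = [(v - f * w) % P for v, w in zip(a[r], a[col])]
                b[r] = mat_comb([b[r], b[col]], [1, -f])
    return b

def encode(blocks, x):
    """Evaluate the matrix polynomial sum_i blocks[i] * x^i at x."""
    return mat_comb(blocks, [pow(x, i, P) for i in range(len(blocks))])

K, L = 2, 2
A_blocks = [[[1, 2, 3]], [[4, 5, 6]]]          # two 1 x 3 row blocks of A
B_blocks = [[[1], [0], [2]], [[3], [1], [1]]]  # two 3 x 1 column blocks of B

workers = [(1, 1), (1, 2), (2, 1), (2, 3), (3, 2), (4, 5)]  # one (x, y) per worker
results = {pt: mat_mul(encode(A_blocks, pt[0]), encode(B_blocks, pt[1])) for pt in workers}

def decode(points):
    """Recover every block A_i B_j from K * L worker results."""
    system = [[pow(x, i, P) * pow(y, j, P) % P for i in range(K) for j in range(L)]
              for x, y in points]
    coeffs = solve_mod_p(system, [results[pt] for pt in points])
    return {(i, j): coeffs[i * L + j] for i in range(K) for j in range(L)}

blocks = decode([(1, 2), (2, 1), (2, 3), (4, 5)])  # workers (1, 1) and (3, 2) straggle
```

Each worker stores one coded block of $A$ and one of $B$. Unlike the univariate case, not every subset of $KL$ bivariate evaluation points yields an invertible system, which is why `solve_mod_p` raises an error on singular inputs.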

Privacy and PIR:

Private polynomial computation is achievable via multivariate star-product codes and Lagrange encoding, allowing users to compute arbitrary multivariate polynomial functions of coded data without revealing the function, and with improved download rates compared to prior constructions (Obead et al., 2019).

Secure Distributed Computation:

Every function over a finite field, viewed as a multivariate polynomial (or symmetric tensor), is amenable to secure, straggler- and Byzantine-tolerant distributed computation schemes with optimal recovery thresholds tied to the multiplicative complexity (tensor rank) and a genus-type overhead (arising from algebraic-geometric codes or other log-additive codes) (Soto, 25 Apr 2025).

6. Algorithmic and Implementation Considerations

Efficient evaluation and code generation for large multivariate polynomials are crucial for practical deployments:

Optimizing Evaluation:

Multivariate Horner schemes generalize univariate evaluation, with the variable ordering impacting operation count. Monte Carlo Tree Search (MCTS) effectively explores the variable-ordering space, producing evaluation code with fewer operations than classical greedy heuristics and thereby making large-scale applications computationally feasible (Kuipers et al., 2012).
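The effect of the variable ordering can be seen on a three-term polynomial. The sketch below implements a plain recursive Horner scheme for a fixed ordering and counts multiplications; it does not implement the MCTS search, and the representation and names are assumptions.

```python
def horner(poly, order, point):
    """Evaluate `poly` by a multivariate Horner scheme.

    poly  : dict mapping exponent tuples to coefficients
    order : variable indices, outermost variable first (must cover all variables)
    point : value of each variable, indexed like the exponent tuples
    Returns (value, number_of_multiplications).
    """
    if not order:
        # No variables left: all remaining exponents are zero, so poly is a constant.
        return sum(poly.values()), 0
    var, rest = order[0], order[1:]
    # Group terms by the power of `var`, stripping that power from each term.
    groups = {}
    for exps, c in poly.items():
        stripped = exps[:var] + (0,) + exps[var + 1:]
        groups.setdefault(exps[var], {})[stripped] = c
    top = max(groups)
    value, mults = horner(groups[top], rest, point)
    for k in range(top - 1, -1, -1):
        value *= point[var]
        mults += 1
        if k in groups:
            inner, inner_mults = horner(groups[k], rest, point)
            value += inner
            mults += inner_mults
    return value, mults

# f(x, y) = x^3 * y + x^3 + y, evaluated at (x, y) = (2, 3).
f = {(3, 1): 1, (3, 0): 1, (0, 1): 1}
x_first = horner(f, (0, 1), (2, 3))
y_first = horner(f, (1, 0), (2, 3))
```

Both orderings return the same value, but factoring out $x$ first shares the power $x^3$ between two terms and uses fewer multiplications than factoring out $y$ first. For polynomials with many variables the number of orderings grows factorially, which motivates a guided search.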

Decoding complexity:

Decoders for advanced multivariate codes often reduce to nested univariate polynomial interpolation, root finding, or solving structured linear systems exploiting affine geometry and block structures.

Field characteristic effects:

Classical multiplicity codes require the field characteristic to exceed the degree for efficient list decoding via Hasse derivatives, a restriction bypassed by divided-difference-based variants, which are insensitive to field characteristic (Venkitesh, 2024).
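One way to see why the characteristic matters is to look at derivative coefficients of a single monomial modulo $p$. The sketch below compares the Hasse derivative coefficient $\binom{n}{k}$ with the iterated ordinary derivative coefficient $n(n-1)\cdots(n-k+1)$ for $X^n$ over $\mathbb{F}_p$. It is meant only as an illustration of information loss when the degree reaches the characteristic, not as a description of the cited decoding algorithms; the function names are assumptions.

```python
from math import comb

def hasse_coeff(degree, order, p):
    """Coefficient of X^(degree - order) in the order-th Hasse derivative of X^degree."""
    return comb(degree, order) % p

def ordinary_coeff(degree, order, p):
    """Same coefficient for the order-th iterated ordinary derivative."""
    coeff = 1
    for k in range(order):
        coeff = coeff * (degree - k) % p
    return coeff

# Over F_2 the second ordinary derivative of X^2 vanishes, the Hasse derivative does not.
over_f2 = (ordinary_coeff(2, 2, 2), hasse_coeff(2, 2, 2))

# Over F_3, every Hasse derivative of X^3 of order 1 or 2 vanishes identically.
low_order_f3 = [hasse_coeff(3, k, 3) for k in (1, 2)]
```

Hasse derivatives repair the first problem, but the second remains: once the degree reaches the characteristic, low-order derivatives can no longer distinguish $X^p$ from a constant.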

7. Advanced Code Families and Performance Trade-offs

Recent advances address core coding-theoretic bottlenecks:

High-rate multivariate codes:

Explicit evaluation domains (simplices and algebraically constructed sets) enable multivariate codes with rate arbitrarily close to 1 and constant relative distance, defeating the exponential rate suppression of Reed–Muller codes at fixed $m$ (Kopparty et al., 2024).

Local recovery with high availability:

Quadratic-lifted (and similar geometric) codes combine high rate with locality and very high availability (many recovery sets per symbol, already in the bivariate quadratic case), outperforming line-based lifted RS codes in rate while supporting similar or superior local recovery (Liu, 7 Jan 2025).

Computation-storage trade-offs:

In distributed matrix computations, generalized multivariate polynomial coding schemes support tuning block partitioning, variable allocation, and code parameters to optimize for application-driven communication and computation resource constraints (Gómez-Vilardebó et al., 2024, Gómez-Vilardebó, 13 Jan 2026).

Optimal thresholds in secure computation:

Log-additive and algebraic-geometric code families ensure that recovery and security thresholds match information-theoretic bounds up to additive genus factors, with stratified support for privacy, robustness, and efficient distributed decoding (Soto, 25 Apr 2025).

References:

(Holzbaur et al., 2021, Bhandari et al., 2020, Venkitesh, 2024, Brakensiek et al., 2023, Liu, 7 Jan 2025, López et al., 31 Dec 2025, Gómez-Vilardebó, 13 Jan 2026, Gómez-Vilardebó et al., 2024, Kopparty et al., 2024, Soto, 25 Apr 2025, Obead et al., 2019, Kuipers et al., 2012)