
Functional Batch Codes

Updated 26 January 2026
  • Functional Batch Codes are families of linear codes designed for distributed storage systems that enable disjoint recovery sets to serve any multiset of k linear combination requests.
  • They generalize standard k-batch and functional PIR codes, employing simplex and Hadamard constructions to achieve near-optimal code lengths and performance.
  • Current research leverages combinatorial designs, algebraic methods, and algorithmic recovery strategies to improve bounds and address open problems in code optimality and locality.

A functional batch code is a family of linear codes designed for distributed storage systems that guarantee the following: any multiset of $k$ requests, each a linear combination of $s$ independent information symbols, can be answered by $k$ pairwise disjoint recovery sets of servers, with each recovery set yielding its requested combination. The central question is, for given $s$ and $k$, what minimum code length $n$ (i.e., the minimum number of servers) such a code requires. Functional batch codes generalize standard $k$-batch codes and functional PIR codes, and are closely connected to simplex code constructions, combinatorial designs, and storage-efficient distributed retrieval.

1. Definition and Core Properties

A linear functional $k$-batch code of dimension $s$ and length $n$, denoted $\operatorname{FB}(n,s,k)$, is specified by a generator matrix $G\in\mathbb{F}_2^{s\times n}$. Each server $j$ stores a code symbol $y_j=g_j\cdot x$, where $x\in\mathbb{F}_2^s$ is the information vector and $g_j$ is the $j$-th column of $G$. For any multiset of $k$ request vectors $v_1,\ldots,v_k\in\mathbb{F}_2^s$, there exist $k$ pairwise disjoint recovery sets $R_1,\ldots,R_k\subseteq [n]$ such that for each $i$,

$$\sum_{j\in R_i} y_j = v_i \cdot x,$$

with all computations over $\mathbb{F}_2$. The code must serve all such multisets, including repeated and arbitrary requests. Special cases include:

  • Functional $k$-PIR code (FP): all $k$ requests coincide.
  • $k$-batch code: each request is a unit vector (single information symbol).

Standard parameters include $\operatorname{FB}(s,k)=\min\{n : \exists\ \operatorname{FB}(n,s,k)\}$ and analogous PIR/batch parameters (Yohananov et al., 2021).
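The definition can be checked by brute force on very small codes. The following is a minimal sketch, not a method from the cited papers: it encodes each column $g_j$ and each request $v_i$ as an integer bitmask and backtracks over subsets of unused columns whose XOR equals the request. The function names `simplex_columns` and `find_recovery_sets` are hypothetical, and the search is exponential, so it is only meant for tiny parameters such as $s=3$.

```python
from itertools import combinations


def simplex_columns(s):
    """All nonzero vectors of F_2^s, encoded as integers 1 .. 2^s - 1."""
    return list(range(1, 2 ** s))


def find_recovery_sets(columns, requests):
    """Exhaustive backtracking search for pairwise disjoint recovery sets.

    columns  : list of ints, column j of G as a bitmask
    requests : list of ints, the multiset of request vectors v_1..v_k
    Returns a list of index tuples R_1..R_k, or None if no choice exists.
    """
    n = len(columns)

    def solve(i, used):
        if i == len(requests):
            return []
        free = [j for j in range(n) if not used & (1 << j)]
        for size in range(1, len(free) + 1):
            for subset in combinations(free, size):
                acc = 0
                for j in subset:
                    acc ^= columns[j]  # sum of columns over F_2
                if acc != requests[i]:
                    continue
                mask = used
                for j in subset:
                    mask |= 1 << j
                rest = solve(i + 1, mask)
                if rest is not None:
                    return [subset] + rest
        return None

    return solve(0, 0)


cols = simplex_columns(3)
# Requests e1, e1, e2, e2 (as bitmasks 4, 4, 2, 2): prints four disjoint index tuples.
print(find_recovery_sets(cols, [4, 4, 2, 2]))
# Five copies of one request cannot be served by 7 columns.
print(find_recovery_sets(cols, [1] * 5))  # None
```

The second call fails for a counting reason: at most one recovery set can be the singleton containing the requested column, and every other set needs at least two columns, so five sets would need at least nine columns.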

2. Main Bounds and Existence Results

Functional batch codes are subject to tight bounds relating $s$, $k$, and $n$.

Binary Codes, Maximal Batch Size:

  • Conjecture (Zhang–Etzion–Yaakobi): For $k=2^{s-1}$, the minimal code length is $n=2^s-1$, i.e., $\operatorname{FB}(s,2^{s-1})=2^s-1$ (Yohananov et al., 2021, Yohananov et al., 19 Jan 2025).
  • Verified for $s\leq 5$ via computer-assisted proofs; all known constructions use the binary simplex code, whose columns enumerate the nonzero vectors of $\mathbb{F}_2^s$.

Improved Existence:

  • Hadamard-based Construction: There exists an $\operatorname{FB}(2^s-1, s, k)$ code for $k=\left\lfloor \frac{5}{6}2^{s-1}\right\rfloor-s$, narrowing the gap to the conjectured $k=2^{s-1}$ compared with earlier results, where only $k\approx 2^{s-1}/2$ was achievable (Yohananov et al., 2021).
  • Optimal for $k=2^s$: $\operatorname{FB}(s,2^s)=2^{s+1}-2$; constructed via double-Hadamard matrices (Yohananov et al., 2021).
  • For general $k$, lower bounds are supplied (e.g., by sphere-covering arguments), with asymptotic minimum length $\operatorname{FB}(s,k)\gtrsim \frac{k}{\log_2(k+1)}\,s$ as $s\to\infty$ (Zhang et al., 2019).
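To get a feel for how far the Hadamard-based batch size is from the conjectured one, the two expressions above can simply be evaluated. This is only arithmetic on the quoted formulas; the helper names `hadamard_k` and `conjectured_k` are hypothetical.

```python
def hadamard_k(s):
    """k = floor((5/6) * 2^(s-1)) - s, computed in exact integer arithmetic."""
    return (5 * 2 ** (s - 1)) // 6 - s


def conjectured_k(s):
    """Conjectured batch size 2^(s-1) for the length-(2^s - 1) simplex code."""
    return 2 ** (s - 1)


for s in (5, 10, 20):
    n = 2 ** s - 1
    ratio = hadamard_k(s) / conjectured_k(s)
    print(f"s={s:2d}  n={n:7d}  k_hadamard={hadamard_k(s):6d}  "
          f"k_conjectured={conjectured_k(s):6d}  ratio={ratio:.4f}")
```

For small $s$ the subtracted term $s$ is significant (for $s=5$ the formula gives $k=8$ against a conjectured $16$), while for large $s$ the ratio approaches $5/6$.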

Generalized Regimes (Nonbinary):

  • Over $\mathbb{F}_q$, with $k$ information symbols, $t$ requests, and field size $q$ (note that $k$ and $t$ here play the roles of $s$ and $k$ in Section 1),

$$t+k-1 \leq \operatorname{FP}(k,t,q) \leq \operatorname{FB}(k,t,q) \leq k t,$$

with explicit values computed for small parameters, e.g., $\operatorname{FB}(2,t,q) = t + \left\lceil \frac{q t}{q+2}\right\rceil$ (Kilic et al., 4 Aug 2025).

  • Asymptotically, as $t\to\infty$, $\operatorname{FB}(k,t,q)/t\to \frac{2(q^k-1)}{q^k+q-2}$ (Kilic et al., 4 Aug 2025).
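As a quick consistency check, the explicit value quoted for the first parameter equal to $2$ agrees with the asymptotic ratio at $k=2$. The derivation below uses only the two formulas stated above and the factorization $q^2+q-2=(q-1)(q+2)$:

```latex
\begin{align*}
\lim_{t\to\infty}\frac{\operatorname{FB}(2,t,q)}{t}
  &= \lim_{t\to\infty}\frac{t+\left\lceil \frac{qt}{q+2}\right\rceil}{t}
   = 1+\frac{q}{q+2}
   = \frac{2(q+1)}{q+2},\\[4pt]
\left.\frac{2(q^k-1)}{q^k+q-2}\right|_{k=2}
  &= \frac{2(q-1)(q+1)}{(q-1)(q+2)}
   = \frac{2(q+1)}{q+2}.
\end{align*}
```

For $q=2$ both expressions equal $3/2$.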

3. Constructions: Simplex and Hadamard Codes

The most prominent constructions for functional batch codes use simplex (Hadamard) codes:

  • Simplex Code Construction: The $[2^s-1, s]$ simplex code, whose generator matrix has all nonzero vectors of $\mathbb{F}_2^s$ as columns, achieves functional batch codes optimal for $k\leq 2^{s-1}-O(2^{s-1})$. For $s\leq 4$, it serves $k=2^{s-1}$ requests (Kong et al., 2023). The double-simplex $[2^{s+1}-2, s]$ code serves $k=2^s$ requests and is optimal (Yohananov et al., 2021).
  • Coset-graph and polynomial partitioning: Recovery sets correspond to disjoint cosets or pair partitions in the support space, relying on combinatorial and algebraic methods, including Nullstellensatz and Vandermonde matrix criteria (Yohananov et al., 19 Jan 2025).
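The pair-partition idea is easiest to see in the functional PIR special case, where all requests equal the same nonzero vector $v$. In the simplex code, the column $v$ itself is one recovery set, and every other column $a$ pairs with $a+v$, since the two columns sum to $v$. The sketch below builds these $2^{s-1}$ disjoint sets; vectors are encoded as integers, and the function name `repeated_request_sets` is hypothetical.

```python
def repeated_request_sets(s, v):
    """Disjoint recovery sets in the [2^s - 1, s] simplex code for a repeated request v.

    Columns are identified with the nonzero integers 1 .. 2^s - 1 (bitmasks).
    Returns the singleton (v,) followed by the pairs (a, a ^ v).
    """
    if not 0 < v < 2 ** s:
        raise ValueError("v must be a nonzero vector of F_2^s")
    sets = [(v,)]
    seen = {0, v}
    for a in range(1, 2 ** s):
        if a not in seen:
            sets.append((a, a ^ v))  # a + (a + v) = v over F_2
            seen.update((a, a ^ v))
    return sets


sets = repeated_request_sets(4, 0b0101)
print(len(sets))  # 8, i.e. 2^(s-1) for s = 4
```

Serving an arbitrary multiset of distinct requests is harder because the pairings for different requests compete for the same columns, which is where the combinatorial and algebraic arguments cited above come in.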

Algorithmic Recovery (Combinatorial):

Codes exploit multigraph decompositions (using an offset vector $\hat{x}$). Cycles in the multigraph correspond to recovery set partitions, and recovery is guaranteed via careful path selection and reordering, ensuring disjoint recovery sets and collision avoidance (Yohananov et al., 2021).

4. Lower Bounds, Asymptotics, and Rate Analysis

Information-theoretic techniques provide fundamental limits:

  • Redundancy bounds: For functional $(s,t)$-batch codes, the minimum redundancy satisfies $r_F(n;s,t)\geq r(n;s,t)$, so every lower bound on ordinary batch code redundancy also applies; explicit polynomial-counting inequalities are given as well (Kong et al., 2023).
  • Labelling recursion for restricted locality: For batch codes with maximum recovery set size $r$, the code length must satisfy

$$n \geq t - \frac{t+1}{2r} + \left[(2^k-1)(r-1)!\right]^{1/r},$$

which grows exponentially with $k$ for constant $r$ (Oksner et al., 18 Jan 2026).

  • Asymptotic tightness: For $r=2$, optimal constructions use double-simplex codes with $n=2^{k+1}-2$ columns for $t=2^k$. Double-simplex constructions achieve near-optimality within a factor of 2 for small $k$ (Oksner et al., 18 Jan 2026).
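The locality bound and the double-simplex length can be compared numerically. The sketch below evaluates the inequality exactly as written above, assuming $k$ is the dimension, $t$ the number of requests, and $r$ the maximum recovery set size; the function names are hypothetical, and the comparison is illustrative rather than a result from the cited work.

```python
from math import factorial


def locality_lower_bound(k, t, r):
    """Right-hand side of n >= t - (t+1)/(2r) + [(2^k - 1)(r-1)!]^(1/r)."""
    return t - (t + 1) / (2 * r) + ((2 ** k - 1) * factorial(r - 1)) ** (1 / r)


def double_simplex_length(k):
    """Length 2^(k+1) - 2 of the double-simplex code of dimension k."""
    return 2 ** (k + 1) - 2


for k in (2, 3, 4):
    t = 2 ** k
    bound = locality_lower_bound(k, t, r=2)
    n = double_simplex_length(k)
    print(f"k={k}  t={t}  bound={bound:.3f}  double-simplex n={n}  ratio={n / bound:.3f}")
```

For these small values of $k$ the construction length stays below twice the bound, in line with the factor-of-2 statement above.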

5. Functional Batch Array Codes and Locality Constraints

Functional batch codes are extensively studied in array format, generalizing recovery to multiple requests and multiple reads per storage column (Nassar et al., 2020):

  • $(s,k,m,t,\ell)$ functional batch array codes: Designed for arrays of $t\times m$ bits, allowing up to $\ell$ bits to be read from each column per request. Recovery sets are subsets of columns, bounded by locality $r$.
  • Lower bounds via Stirling counts: Minimal number of columns mm grows with the number of possible requests and the combinatorics of partitioning recovery sets.
  • Construction paradigms: Codes arise from partitioning the data space into spreads, using combinatorial designs, and covering code reductions, providing flexibility in controlling retrieval locality and code rate.

6. Algebraic and Graph-Theoretic Methods

Recent advances recast the existence question as equivalent to algebraic and graph-theoretic problems:

  • Pairing and polynomial criteria: The optimality conjecture for $k=2^{s-1}$ is equivalent to finding partitions of $\mathbb{F}_2^s$ into $k$ pairs satisfying specified vector sums, or establishing non-vanishing of special polynomials in a suitable quotient ring (Yohananov et al., 19 Jan 2025).
  • Nullstellensatz and Vandermonde conditions: Codes corresponding to full-rank Vandermonde-type matrices over extension fields yield new sufficiency criteria, broadening known ad-hoc sufficient conditions and relating combinatorial existence to algebraic nondegeneracy (Yohananov et al., 19 Jan 2025).
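The pairing condition can be explored directly for small $s$. The following is a minimal sketch that searches for a partition of $\mathbb{F}_2^s$ (zero vector included) into $2^{s-1}$ pairs whose sums match a prescribed multiset of nonzero vectors. It checks only the pairing condition as phrased above, not the full equivalence established in the cited work, and the name `find_pairing` is hypothetical. Vectors are encoded as integers, so the sum of a pair is its XOR.

```python
from collections import Counter


def find_pairing(s, sums):
    """Partition {0, ..., 2^s - 1} into pairs (a, b) with a ^ b matching `sums` as a multiset.

    Returns a list of pairs, or None if no such partition exists.
    """
    size = 2 ** s
    if len(sums) != size // 2 or any(not 0 < v < size for v in sums):
        raise ValueError("need 2^(s-1) nonzero vectors of F_2^s")
    remaining = Counter(sums)
    used = [False] * size
    pairs = []

    def solve():
        # Always extend the partition at the smallest unused element.
        a = next((x for x in range(size) if not used[x]), None)
        if a is None:
            return True
        for v in list(remaining):
            b = a ^ v
            if remaining[v] == 0 or used[b]:
                continue
            used[a] = used[b] = True
            remaining[v] -= 1
            pairs.append((a, b))
            if solve():
                return True
            pairs.pop()
            remaining[v] += 1
            used[a] = used[b] = False
        return False

    return pairs if solve() else None


print(find_pairing(3, [1, 1, 2, 2]))  # some valid partition into four pairs
print(find_pairing(3, [1, 1, 1, 2]))  # None
```

The second call must fail because the pair sums of any partition add up to the sum of all vectors of $\mathbb{F}_2^3$, which is zero, whereas the prescribed sums add up to a nonzero vector.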

7. Open Problems and Future Directions

Despite rapid progress, several questions remain unresolved:

  • Full generality of the simplex code conjecture: The statement $\operatorname{FB}(s,2^{s-1})=2^s-1$, while verified for small $s$, remains open for general $s$. Algebraic sufficient conditions have been recognized for large parameter ranges, but exhaustive or combinatorial proofs are lacking (Yohananov et al., 19 Jan 2025).
  • Extension to nonbinary and larger fields: Precisely determining minimal code lengths for functional batch codes over $\mathbb{F}_q$, especially for $t \sim q^k$, remains open, with ongoing research offering asymptotic and constructive bounds (Kilic et al., 4 Aug 2025).
  • Locality-optimal constructions: Explicit codes for fixed locality $r > 2$ matching established lower bounds are essentially unknown; finding such families remains an outstanding challenge (Oksner et al., 18 Jan 2026).
  • Tight asymptotics for small locality: Quantifying the precise constant-factor increase in code length when constraining recovery sets to small size is an active research front (Oksner et al., 18 Jan 2026).

The study of functional batch codes continues to impact storage system design, random I/O codes, and associated combinatorial structures, with simplex-based, algebraic, and array formulations providing the foundation for current and future advances.
