Covering Arrays in Combinatorics

Updated 29 January 2026

Covering arrays are combinatorial matrices that ensure every t-wise interaction among parameters appears at least once, providing rigorous test coverage.
They are constructed using probabilistic methods such as the Lovász Local Lemma, algebraic and group-theoretic techniques, and computational search, with the aim of keeping the number of tests as small as possible.
Applications span software testing, combinatorial design, and extremal set theory, driving research on asymptotic bounds and optimal configurations.

A covering array is a fundamental combinatorial object, central to interaction testing, combinatorial design, and extremal set theory. It is an $N \times k$ array over a $v$-ary alphabet in which every $N \times t$ subarray contains all $v^t$ possible tuples among its rows, guaranteeing exhaustive coverage of all $t$-way interactions among $k$ parameters, each with $v$ levels. The covering array number $\mathrm{CAN}(t, k, v)$ is the minimum $N$ for which such an array exists.

1. Formal Definitions and Basic Properties

A covering array $\mathrm{CA}(N; t, k, v)$ is an $N \times k$ matrix $A$ over $\{0,1,\ldots,v-1\}$ with the property

$\forall\,T=\{i_1,\ldots,i_t\}\subseteq\{1,\ldots,k\},\quad \text{and}\ x\in\{0,\ldots,v-1\}^t,\ \exists\,r\in\{1,\ldots,N\}\ \text{such that}\ (A_{r,i_1},\ldots,A_{r,i_t}) = x.$

The covering array number is $\mathrm{CAN}(t,k,v) = \min\{ N : \mathrm{CA}(N;t,k,v)\ \text{exists} \}.$ Orthogonal arrays $\mathrm{OA}_\lambda(t,k,v)$ require each $t$-tuple to appear exactly $\lambda$ times in every $t$-column subarray, whereas covering arrays require only at least one occurrence. Covering arrays thus generalize orthogonal arrays, and $\mathrm{CAN}(t,k,v)$ is the minimal test size needed to guarantee $t$-wise coverage (Hiess et al., 20 Oct 2025).
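The defining property can be checked directly by enumerating every set of $t$ columns. The following is a minimal sketch; the function name `is_covering_array` and the row-tuple representation are illustrative choices, not taken from the cited works.

```python
from itertools import combinations, product

def is_covering_array(rows, t, v):
    """Return True if every t-column subarray of `rows` contains all v**t tuples."""
    k = len(rows[0])
    all_tuples = set(product(range(v), repeat=t))
    for cols in combinations(range(k), t):
        # Project every row onto the chosen columns and collect the tuples seen.
        seen = {tuple(row[c] for c in cols) for row in rows}
        if not all_tuples <= seen:
            return False
    return True

# A CA(4; 2, 3, 2): four rows cover all pairs among three binary parameters.
ca = [
    (0, 0, 0),
    (0, 1, 1),
    (1, 0, 1),
    (1, 1, 0),
]
print(is_covering_array(ca, t=2, v=2))
```

The check costs $\binom{k}{t}$ projections of $N$ rows each, which is fine for small arrays but grows quickly with $t$.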

2. Asymptotic Bounds and Constructions

Logarithmic Growth

The classical result (Katona–Kleitman–Godbole) is

$\mathrm{CAN}(t,k,v) = \Theta(\log_2 k)$

for fixed $t$, $v$ and $k \to \infty$ (Francetić et al., 2015).

Probabilistic and Analytical Bounds

The Lovász Local Lemma (LLL) yields $\mathrm{CAN}(t,k,v) \leq \frac{(t-1)\,\log_2 k}{\log_2(v^t/(v^t-1))}(1+o(1))$. More refined constructions, such as those using fixed-weight columns, improve the constants for small $t$ and $v$ (Yuan et al., 2014).
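The weaker union-bound version of this argument can be worked out in a few lines. It is sketched here as standard background, not as the refined analysis of the cited works. Fill an $N \times k$ array with independent uniform symbols. For a fixed set of $t$ columns and a fixed $t$-tuple,

$\Pr[\text{tuple missing from these columns}] = \left(1 - v^{-t}\right)^{N},$

and there are $\binom{k}{t} v^t$ such events, so a covering array exists whenever

$\binom{k}{t}\, v^{t} \left(1 - v^{-t}\right)^{N} < 1 \iff N > \frac{\log_2\!\big(\binom{k}{t}\, v^{t}\big)}{\log_2\!\big(v^{t}/(v^{t}-1)\big)}.$

Since $\binom{k}{t} \le k^t$, the leading term is $t \log_2 k / \log_2(v^t/(v^t-1))$. The LLL improves the factor $t$ to $t-1$ because each bad event shares a column with only $O(k^{t-1})$ of the others.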

Let $d(t,v) := \limsup_{k \to \infty}\mathrm{CAN}(t,k,v)/\log_2 k$. Entropy compression (an algorithmic form of the LLL) refines the constant to $d(t,v) \leq \frac{v(t-1)}{\log_2(v^{t-1}/(v^{t-1}-1))}$, and multivariable optimization sharpens this further (Francetić et al., 2015) to $d(t,v) \leq \frac{v(t-1)}{ (t-1)\log_2( v^v/(v-1)^{v-1} ) - f_0(t,v) }$, where the correction term $f_0(t,v)$ is as defined in that work.
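To see how much the first entropy-compression bound gains over the LLL constant, both can be evaluated numerically. The sketch below only evaluates the two closed forms quoted above; the function names are illustrative, and it assumes $t \ge 2$ and $v \ge 2$ so that both logarithms are well defined.

```python
from math import log2

def lll_constant(t, v):
    """Leading constant (t-1) / log2(v^t / (v^t - 1)) of the LLL bound."""
    return (t - 1) / log2(v**t / (v**t - 1))

def entropy_compression_constant(t, v):
    """Bound v(t-1) / log2(v^(t-1) / (v^(t-1) - 1)) on d(t, v)."""
    return v * (t - 1) / log2(v**(t - 1) / (v**(t - 1) - 1))

for t, v in [(2, 2), (2, 3), (3, 2), (3, 3)]:
    print(t, v, round(lll_constant(t, v), 3),
          round(entropy_compression_constant(t, v), 3))
```

For $t=2$, $v=2$ the LLL constant is $1/\log_2(4/3) \approx 2.409$, while the entropy-compression bound is exactly $2$.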

3. Exact Results for Small Parameters and Uniqueness

Binary Arrays and Strength Two

For $v=2$, $t=2$, binary strength-two covering arrays with $m$ rows and the maximum possible number of columns are unique up to equivalence. They are given by the standard maximal array, an $m \times \binom{m-1}{\lfloor m/2 \rfloor-1}$ matrix whose columns are all binary vectors of fixed weight (Choi et al., 2011).
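One standard way to realize this array is sketched below. It assumes a particular normalization that the text above does not spell out: the first row is all zeros, and in the remaining $m-1$ rows the columns are exactly the binary vectors with $\lceil m/2 \rceil$ ones. The function names are illustrative.

```python
from itertools import combinations
from math import comb

def standard_maximal_array(m):
    """Binary strength-2 array with m rows and comb(m-1, m//2 - 1) columns.

    Assumed normalization: row 0 is all zeros, and each column has exactly
    ceil(m/2) ones among rows 1..m-1.
    """
    weight = (m + 1) // 2  # ceil(m / 2)
    columns = []
    for support in combinations(range(1, m), weight):
        col = [0] * m
        for r in support:
            col[r] = 1
        columns.append(col)
    # Transpose the list of columns into a list of rows.
    return [tuple(col[r] for col in columns) for r in range(m)]

def covers_all_pairs(rows):
    """Check binary strength 2: every column pair shows 00, 01, 10 and 11."""
    k = len(rows[0])
    target = {(0, 0), (0, 1), (1, 0), (1, 1)}
    return all(
        {(row[i], row[j]) for row in rows} == target
        for i, j in combinations(range(k), 2)
    )

array = standard_maximal_array(6)
print(len(array), len(array[0]), covers_all_pairs(array))
```

Coverage follows from the weights: the zero row supplies $00$, two supports of size $\lceil m/2 \rceil$ inside $m-1$ rows must intersect (giving $11$), and two distinct supports of equal size each contain a row the other lacks (giving $01$ and $10$).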

Computational Determination

Modern computational methods (isomorph-free exhaustive search, canonical augmentation) have determined the optimal size for 21 strength-two cases with $v>2$, $N>v^2$ (Kokkala et al., 2019). In each case a uniform optimal array exists, meaning that every symbol occurs $\lfloor N/v \rfloor$ or $\lceil N/v \rceil$ times in every column.

4. Generalizations: Mixed, Hypergraph, and Sequence Covering Arrays

Mixed Covering Arrays

Generalized covering designs handle multi-part alphabets and block sizes, providing lower and upper bounds via Schönheim's bound, edge-counting, and recursive constructions (Bailey et al., 2010). Partial and mixed-level arrays are realized by partial covering arrays and hypergraph-based models.

Hypergraph Covering Arrays

Covering arrays on $r$-uniform hypergraphs restrict coverage to prescribed interactions, e.g., only certain triples. Inductive construction uses hooking operations (vertex/edge additions), achieving optimally small arrays for $\alpha$-acyclic and conformal hypertrees (Akhtar et al., 2015).
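The effect of restricting coverage to prescribed interactions can be seen on a very small example. The sketch below uses a graph (the case $r=2$) for brevity; the function name `covers_hypergraph` and the edge-list representation are illustrative assumptions.

```python
from itertools import product

def covers_hypergraph(rows, edges, v):
    """Return True if, for every hyperedge, the columns it names show all tuples."""
    for edge in edges:
        cols = sorted(edge)
        needed = set(product(range(v), repeat=len(cols)))
        seen = {tuple(row[c] for c in cols) for row in rows}
        if not needed <= seen:
            return False
    return True

# Path 0 - 1 - 2: only the pairs {0,1} and {1,2} must be covered.
rows = [
    (0, 0, 0),
    (0, 1, 0),
    (1, 0, 1),
    (1, 1, 1),
]
path_edges = [(0, 1), (1, 2)]
print(covers_hypergraph(rows, path_edges, v=2))   # edges of the path
print(covers_hypergraph(rows, [(0, 2)], v=2))     # the non-edge {0,2}
```

Here columns 0 and 2 are identical, which would be fatal for a full strength-two array but is harmless when $\{0,2\}$ is not a prescribed interaction.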

Sequence and Perfect Sequence Covering Arrays

Sequence covering arrays (SCAs) are sets of permutations of $n$ symbols in which every ordered sequence of $k$ distinct symbols appears as a subsequence of at least one permutation; perfect sequence covering arrays (PSCAs) require every such sequence to appear exactly $\lambda$ times (Na et al., 2022; Gentle et al., 2022). The minimal such $\lambda$ is denoted $g(n,k)$. For $(n,k) \in \{(5,3),(6,3),(7,3),(7,4)\}$, $g(n,k)=2$; for $(8,3)$, $g(8,3)=3$ (Na et al., 2022). PSCAs are tightly connected to directed designs and deletion-correcting codes.
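Both definitions reduce to counting, for each ordered $k$-sequence of distinct symbols, how many permutations contain it as a subsequence. A minimal sketch follows, with illustrative function names and symbols taken as $0,\ldots,n-1$:

```python
from collections import Counter
from itertools import combinations, permutations

def subsequence_counts(perms, k):
    """Count how many permutations contain each ordered k-sequence as a subsequence."""
    counts = Counter()
    for perm in perms:
        # Choosing k positions in increasing order yields one subsequence.
        for positions in combinations(range(len(perm)), k):
            counts[tuple(perm[p] for p in positions)] += 1
    return counts

def is_sca(perms, k):
    """Every ordered k-sequence of distinct symbols is covered at least once."""
    n = len(perms[0])
    counts = subsequence_counts(perms, k)
    return all(counts[s] >= 1 for s in permutations(range(n), k))

def is_psca(perms, k, lam):
    """Every ordered k-sequence of distinct symbols is covered exactly lam times."""
    n = len(perms[0])
    counts = subsequence_counts(perms, k)
    return all(counts[s] == lam for s in permutations(range(n), k))

pair = [(0, 1, 2, 3), (3, 2, 1, 0)]
print(is_psca(pair, k=2, lam=1), is_sca(pair, k=3))
```

A permutation and its reverse cover every ordered pair exactly once, so they form a PSCA for $k=2$ with $\lambda=1$, but two permutations supply only 8 of the 24 ordered triples on four symbols.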

5. Partial and Relaxed Covering Arrays

Relaxed requirements give rise to partial covering arrays, covering only a fraction of $t$-sets or tuples (Sarkar et al., 2016). Important results:

Partial covering arrays in which every $N \times t$ subarray need only cover at least $m$ of the $v^t$ tuples (a fraction $\alpha = m/v^t$):

$N = O\left( \frac{v^t(t-1)\,\ln k}{v^t - m + 1} \right)$

$\epsilon$-almost covering arrays:

$N = O( v^t \ln( v^{t-1}/\epsilon ) )$

Moser–Tardos resampling and Markov-type randomized algorithms ensure efficient generation, matching information-theoretic lower bounds up to constant factors.
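The resampling idea can be sketched as follows: start from a uniformly random array and, while some set of $t$ columns misses a tuple, redraw exactly those columns. This is a simplified illustration for full coverage, not the algorithm of the cited work; the function names, the `seed` and `max_rounds` parameters, and the choice to resample the first violated column set are all assumptions. The expected-polynomial-time guarantee applies only when the number of rows is large enough for the LLL condition to hold.

```python
import random
from itertools import combinations

def first_uncovered(rows, t, v):
    """Return a t-set of columns that misses some tuple, or None if all are covered."""
    k = len(rows[0])
    for cols in combinations(range(k), t):
        seen = {tuple(row[c] for c in cols) for row in rows}
        if len(seen) < v ** t:
            return cols
    return None

def resample_covering_array(n_rows, t, k, v, seed=0, max_rounds=100_000):
    """Moser-Tardos-style search for a CA(n_rows; t, k, v)."""
    rng = random.Random(seed)
    rows = [[rng.randrange(v) for _ in range(k)] for _ in range(n_rows)]
    for _ in range(max_rounds):
        bad = first_uncovered(rows, t, v)
        if bad is None:
            return [tuple(row) for row in rows]
        # Redraw only the entries in the violated columns.
        for row in rows:
            for c in bad:
                row[c] = rng.randrange(v)
    raise RuntimeError("no covering array found within max_rounds")

array = resample_covering_array(n_rows=12, t=2, k=6, v=2)
print(first_uncovered(array, 2, 2) is None)
```

With too few rows the loop may never terminate, which is why the sketch carries an explicit round limit.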

6. Arrays with Higher Index (Replication) and Constraints

Covering Arrays of Index $\lambda$

For index $\lambda>1$ (every $t$-tuple occurs at least $\lambda$ times in every $t$-column subarray), the main asymptotic bound is $\mathrm{CAN}_\lambda(t,k,v) = \Theta_{v,t}(\log k + \lambda)$ (Calbert et al., 2022), which removes the $\lambda \log \log k$ terms of earlier bounds. Improved leading constants are obtained via the Lovász Local Lemma and two-stage alteration schemes; graph coloring yields further reductions for higher $\lambda$.
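The index of a given array is the smallest multiplicity of any $t$-tuple over all $t$-column subarrays. A minimal sketch of that computation, with an illustrative function name:

```python
from collections import Counter
from itertools import combinations, product

def covering_index(rows, t, v):
    """Largest lambda such that every t-tuple occurs >= lambda times in every t columns."""
    k = len(rows[0])
    best = len(rows)
    for cols in combinations(range(k), t):
        counts = Counter(tuple(row[c] for c in cols) for row in rows)
        # Missing tuples count as zero occurrences.
        best = min(best, min(counts[x] for x in product(range(v), repeat=t)))
    return best

ca = [(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 0)]
print(covering_index(ca, 2, 2), covering_index(ca * 2, 2, 2))
```

Stacking $\lambda$ copies of an index-one array, as in `ca * 2`, gives only the trivial bound $\mathrm{CAN}_\lambda(t,k,v) \le \lambda\,\mathrm{CAN}(t,k,v)$, which is of order $\lambda \log k$ and therefore much weaker than the additive $\log k + \lambda$ growth stated above.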

SAT/MaxSAT-Based Construction

Satisfiability-based encodings of the covering array problem allow both exact and approximate solving, even under additional constraints (forbidden tuples, system-specific restrictions) (Ansótegui et al., 2021). MaxSAT variants minimize test suite size; incomplete MaxSAT is especially effective on large-scale constrained problems.
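To make the idea of an encoding concrete, the sketch below builds a naive CNF formula that is satisfiable exactly when a $\mathrm{CA}(N;t,k,v)$ exists. It is a textbook-style illustration and is not claimed to be the encoding used in the cited work. The variable numbering, the auxiliary "witness" variables, and the function name are assumptions. Clauses are lists of signed integers in the usual DIMACS convention, so they could be handed to any SAT solver; no solver is invoked here.

```python
from itertools import combinations, product

def encode_covering_array(n_rows, t, k, v):
    """Return (num_vars, clauses, cell_var) for 'a CA(n_rows; t, k, v) exists'."""
    def cell_var(r, c, a):
        # Variable is true when the cell in row r, column c holds symbol a.
        return 1 + (r * k + c) * v + a

    next_var = n_rows * k * v
    clauses = []
    for r in range(n_rows):
        for c in range(k):
            # Each cell holds at least one symbol and at most one symbol.
            clauses.append([cell_var(r, c, a) for a in range(v)])
            for a, b in combinations(range(v), 2):
                clauses.append([-cell_var(r, c, a), -cell_var(r, c, b)])
    for cols in combinations(range(k), t):
        for tup in product(range(v), repeat=t):
            witnesses = []
            for r in range(n_rows):
                next_var += 1
                witnesses.append(next_var)
                # Witness variable implies that row r shows `tup` on `cols`.
                for c, a in zip(cols, tup):
                    clauses.append([-next_var, cell_var(r, c, a)])
            # Some row must witness this tuple on these columns.
            clauses.append(witnesses)
    return next_var, clauses, cell_var

num_vars, clauses, cell_var = encode_covering_array(4, 2, 3, 2)
print(num_vars, len(clauses))
```

Forbidden tuples of the kind mentioned above would add one clause each per row, forbidding the corresponding combination of cell variables.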

7. Algebraic and Group-Theoretic Constructions

Finite Field and Group Development Constructions

Maximum-length sequences (m-sequences), cyclic trace arrays, and group actions (e.g., $\mathrm{PGL}(2,q)$) are exploited to yield covering arrays of high strength and efficiency (Maity et al., 2015; Tzanakis, 2017). For strength $t=4$ and $v=3$, explicit bounds $\mathrm{CAN}(4,k,3) \le 12k + 3$ are achieved using projective general linear group constructions and starter vectors. For binary arrays, concatenation of cyclic Hamming codes, self-dual sequence families, interleaving, and primitive polynomial periodicity yield near-optimal covering sequences and arrays (Chee et al., 12 Feb 2025; Chee et al., 2024).
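The simplest finite-field construction is not one of the cited m-sequence or $\mathrm{PGL}$ constructions but the classical polynomial-evaluation array, shown here only to illustrate how field structure yields coverage. The sketch assumes $p$ is prime and $t \le p$; the function names are illustrative. Each row lists the values of one polynomial of degree less than $t$ over $\mathrm{GF}(p)$ at the points $0,\ldots,p-1$.

```python
from itertools import combinations, product

def polynomial_array(p, t):
    """Rows are evaluations of all polynomials of degree < t over GF(p), p prime."""
    rows = []
    for coeffs in product(range(p), repeat=t):
        rows.append(tuple(
            sum(c * pow(x, i, p) for i, c in enumerate(coeffs)) % p
            for x in range(p)
        ))
    return rows

def is_covering_array(rows, t, v):
    """Return True if every t-column subarray of `rows` contains all v**t tuples."""
    k = len(rows[0])
    all_tuples = set(product(range(v), repeat=t))
    return all(
        all_tuples <= {tuple(row[c] for c in cols) for row in rows}
        for cols in combinations(range(k), t)
    )

array = polynomial_array(3, 2)
print(len(array), len(array[0]), is_covering_array(array, 2, 3))
```

By Lagrange interpolation, any $t$ values at $t$ distinct points determine exactly one such polynomial, so every $t$-tuple appears exactly once in every $t$ columns. The result is an orthogonal array of index 1, hence a $\mathrm{CA}(p^t; t, p, p)$ that meets the trivial lower bound $N \ge v^t$.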

8. Open Problems and Current Research Directions

Key open areas include:

Determining $\mathrm{CAN}(t,k,v)$ for more small $(t,k,v)$ and high-strength ($t>4$) parameter sets (Hiess et al., 20 Oct 2025).
Improving leading constants and sharpening lower bounds, especially for partial arrays and arrays of higher index.
Extending algebraic and group-theoretic constructions to broader parameter ranges, especially using cyclotomy and discrete logarithms (Tzanakis, 2017).
Classification and existence problems for mixed, hypergraph, and constrained covering arrays.
Connections to perfect sequence covering arrays, deletion codes, and directed $t$-designs.
Systematic study of optimal arrays for non-binary alphabets and verification of the uniformity conjecture (Kokkala et al., 2019).
Efficient SAT/MaxSAT and enumeration algorithms capable of exact or near-optimal constructions in real-world, large-scale systems.

Table: Summary of Covering Array Number Bounds

Type	Primary Bound or Complexity	Reference
Classical ($t$-way, full)	$O((t-1)v^t\log k)$; $\Theta(v^t\log k)$	(Sarkar et al., 2016)
Logarithmic growth ($t$ fixed)	$\Theta(\log_2 k)$	(Francetić et al., 2015)
Probabilistic/LLL bound	$(t-1)\log_2 k/\log_2(v^t/(v^t-1))$	(Yuan et al., 2014)
Entropy compression	$v(t-1)/\log_2[v^{t-1}/(v^{t-1}-1)]$	(Francetić et al., 2015)
Higher index $\lambda$	$\Theta(\log k+\lambda)$	(Calbert et al., 2022)
Small $v=2$, $t=2$ (exact)	$\binom{m-1}{\lfloor m/2 \rfloor-1}$ columns	(Choi et al., 2011)
Partial/relaxed coverage	$O(v^{t-1}\log k)$ for small relaxations	(Sarkar et al., 2016)
Algebraic/group development ($t\ge3$)	Polynomial/construction-dependent	(Maity et al., 2015)

Covering arrays remain a rich, continually evolving domain at the intersection of extremal combinatorics, algorithmic theory, algebraic design, and practical test suite optimization.