Sparsity Testing for Low-Degree Polynomials
- Sparsity testing for low-degree polynomials is a method to efficiently decide if a function is representable as an s-sparse polynomial or is ε-far from any such representation.
- The approach leverages random restrictions, implicit learning, and hash-based dimension reduction to achieve significant improvements in query and time complexity.
- Key innovations, such as proper exact learning and hitting-set constructions, extend the framework to both Boolean functions and finite field evaluations.
Sparsity testing for low-degree polynomials is concerned with property testing of Boolean or finite field functions $f:\{0,1\}^n \to \{0,1\}$ or $f:\mathbb{F}_q^n \to \mathbb{F}_q$ (typically $\mathbb{F}_2$ or larger finite fields) to efficiently distinguish whether $f$ is representable as an $s$-sparse polynomial—i.e., a polynomial with at most $s$ nonzero monomials—or whether it is $\epsilon$-far (in Hamming distance) from all such $s$-sparse polynomials. This area lies at the interface of property testing, computational learning theory, and algebraic complexity, and features techniques such as implicit learning, zero-testing via hitting sets, and random hash projections.
1. Formal Definitions and Problem Setting
A Boolean function $f:\{0,1\}^n \to \{0,1\}$ admits a unique representation as a multilinear polynomial over $\mathbb{F}_2$, i.e.,

$$f(x_1,\dots,x_n) \;=\; \bigoplus_{S \subseteq [n]} c_S \prod_{i \in S} x_i, \qquad c_S \in \mathbb{F}_2.$$

The function is $s$-sparse if the number of nonzero coefficients $c_S$ is at most $s$. For general finite fields $\mathbb{F}_q$, the same sparsity notion applies but with coefficients in $\mathbb{F}_q$.
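The definition above can be made concrete. Below is a minimal sketch (assuming a truth-table representation; the function names are illustrative, not from the cited papers) that recovers the multilinear coefficients over $\mathbb{F}_2$ via the Möbius transform and counts the nonzero monomials:

```python
def anf_coefficients(truth_table, n):
    """Moebius transform over GF(2): a butterfly pass per coordinate.

    truth_table[x] = f(x), with x read as an n-bit integer.
    Returns coeffs where coeffs[S] = 1 iff the monomial prod_{i in S} x_i
    appears in the multilinear representation of f.
    """
    coeffs = list(truth_table)
    for i in range(n):
        bit = 1 << i
        for x in range(1 << n):
            if x & bit:
                coeffs[x] ^= coeffs[x ^ bit]
    return coeffs

def sparsity(truth_table, n):
    """Number of nonzero monomials in the GF(2) polynomial for f."""
    return sum(anf_coefficients(truth_table, n))

# Example: f(x0, x1, x2) = x0*x1 XOR x2 has exactly two monomials.
f = [(((x >> 0) & 1) & ((x >> 1) & 1)) ^ ((x >> 2) & 1) for x in range(8)]
# sparsity(f, 3) == 2
```

The transform runs in $O(n \cdot 2^n)$ time, which is only feasible for small $n$; the testers discussed below exist precisely to avoid reading the whole truth table.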
Given $s \in \mathbb{N}$, $\epsilon \in (0,1)$, and query access to $f$, the (uniform-distribution) sparsity testing problem is to design a (randomized) algorithm that with high probability:
- Accepts if $f$ is $s$-sparse,
- Rejects if $\mathrm{dist}(f,g) \ge \epsilon$ for all $s$-sparse polynomials $g$, where $\mathrm{dist}(f,g) = \Pr_{x}[f(x) \neq g(x)]$.
For more generality, one can consider functions $f: A_1 \times \cdots \times A_n \to \mathbb{F}$ for finite sets $A_i \subseteq \mathbb{F}$, and the analogous sparsity setting over such rectangular domains.
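The distance in the rejection condition can be estimated empirically for any candidate $g$. A hedged sketch (hypothetical helper names; uniform-distribution setting as defined above):

```python
import random

def estimate_distance(f, g, n, samples=10000, rng=random):
    """Estimate dist(f, g) = Pr_x[f(x) != g(x)] under the uniform
    distribution by random sampling. Accuracy is roughly
    O(1/sqrt(samples)) by standard concentration bounds."""
    disagreements = 0
    for _ in range(samples):
        x = tuple(rng.randint(0, 1) for _ in range(n))
        if f(x) != g(x):
            disagreements += 1
    return disagreements / samples
```

For example, parity on three bits disagrees with the zero function on half the domain, so the estimate concentrates near $0.5$.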
2. Algorithmic Frameworks for Sparse Polynomial Testing
2.1 Testing via Random Restrictions and Implicit Learning
The "Test-Sparse-Poly" algorithm (0805.1765) operates as follows:
- Random Partitioning and Independence Testing: Partition the coordinates $[n]$ randomly into blocks $B_1, \dots, B_r$. Estimate the variation of $f$ on each block via Fischer’s independence test. Significant variation indicates the presence of high-influence variables.
- Thresholding and Pruning Low-Variation Blocks: Selecting a random threshold $\theta$, blocks are classified as 'high' or 'low' variation. Variables in low-variation blocks are fixed to zero, yielding a restricted function $f'$. Structural results show that $f'$ is an $s$-sparse polynomial on a small number of variables (polynomial in $s$ and $1/\epsilon$) and is close in distance to $f$.
- Simulation of Proper Exact Learning: Use the Schapire–Sellie proper exact learning algorithm on $f'$, simulating membership queries via a subroutine ("SimMQ") that controls variable fixing in structured blocks.
The algorithm achieves query and time complexity polynomial in $s$ and $1/\epsilon$, improving substantially over previous work that required time exponential in $s$ for the learning phase (0805.1765).
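The first two steps above can be sketched as follows (an illustrative simplification, not the actual Test-Sparse-Poly implementation; block variation is estimated by re-randomizing a block's coordinates, in the spirit of the independence test):

```python
import random

def random_partition(n, r, rng):
    """Randomly assign each of the n coordinates to one of r blocks."""
    blocks = [[] for _ in range(r)]
    for i in range(n):
        blocks[rng.randrange(r)].append(i)
    return blocks

def estimate_block_variation(f, n, block, trials, rng):
    """Draw x uniformly, re-randomize the coordinates in `block`, and
    record how often f changes. A noticeable change rate indicates a
    high-influence variable inside the block."""
    changes = 0
    for _ in range(trials):
        x = [rng.randint(0, 1) for _ in range(n)]
        y = list(x)
        for i in block:
            y[i] = rng.randint(0, 1)
        if f(tuple(x)) != f(tuple(y)):
            changes += 1
    return changes / trials
```

A block containing an influential variable (e.g. for $f(x) = x_0$, the block holding coordinate $0$) shows variation near $1/2$ here, while blocks of irrelevant variables show none.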
2.2 Learn-Then-Test Reductions and Near-Optimality
Bshouty's "learn-then-test" reduction (Bshouty, 2022) achieves nearly optimal testing rates by:
- Applying a random hashing of the coordinates to dimension-reduce from $n$ variables to far fewer while preserving $s$-sparsity.
- Properly learning the $s$-sparse image function using a specialized proper learning algorithm with sublinear query complexity.
- Lifting the learned polynomial back and using random samples to distinguish proximity to the $s$-sparse structure.
This method yields an overall query complexity that matches known lower bounds up to polylogarithmic factors for the stated ranges of $s$ and $\epsilon$ (Bshouty, 2022).
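The hashing step can be sketched as follows (an illustrative stand-in for Bshouty's construction, using a uniformly random coordinate hash; the helper names are our own):

```python
import random

def hash_projection(f, n, m, rng):
    """Randomly hash the n original variables onto m new variables and
    return the projected function g(y) = f(y_{h(1)}, ..., y_{h(n)}).
    Distinct monomials of an s-sparse f collide only with small
    probability when m is chosen large enough relative to s."""
    h = [rng.randrange(m) for _ in range(n)]

    def g(y):
        # Each original coordinate i reads the hashed variable y[h[i]].
        return f(tuple(y[h[i]] for i in range(n)))

    return g, h
```

By construction, a monomial such as $x_0 x_5$ maps to $y_{h(0)} y_{h(5)}$, so sparsity can only decrease (via collisions), never increase.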
2.3 Zero Testing for Sparse Polynomials over Finite Fields
For black-box zero testing over $\mathbb{F}_q^n$ when $f$ has a bounded number of monomials, the sphere-of-influence theorem (Aichinger et al., 2023) ensures that, for any center $c \in \mathbb{F}_q^n$, there exists a point $y$ in the Hamming ball of a suitable radius $r$ around $c$ with $f(y) \neq 0$ (assuming $f$ is nonzero), where $r$ depends on the number of monomials and on the structure of $\mathbb{F}_q$. Querying $f$ on all points within this ball yields a deterministic zero-testing algorithm whose query complexity is the size of the ball (Aichinger et al., 2023).
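A hedged sketch of the resulting deterministic procedure (the radius is taken as an input parameter here; the cited theorem is what justifies a particular choice of radius):

```python
from itertools import combinations, product

def zero_test(f, n, q, center, radius):
    """Deterministic zero test: query f on every point of F_q^n within
    Hamming distance `radius` of `center`. Returns True iff f vanishes
    on the whole ball; by the sphere-of-influence guarantee, a nonzero
    sparse f must have a witness inside a large enough ball."""
    for k in range(radius + 1):
        for positions in combinations(range(n), k):
            for values in product(range(q), repeat=k):
                point = list(center)
                for pos, val in zip(positions, values):
                    point[pos] = val
                if f(tuple(point)) != 0:
                    return False  # witness found: f is not identically zero
    return True  # f vanished on the entire Hamming ball
```

The number of queries is exactly the enumeration count of the ball (some points are visited more than once in this naive sketch), matching the "query complexity = ball size" accounting above.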
3. Structural Results and Key Lemmas
Central to efficient sparsity testing over $\mathbb{F}_2$ is the structural lemma [(0805.1765), Theorem 3]: under a random partition and restriction (fixing low-variation blocks), every $s$-sparse polynomial is well-approximated by a junta depending on a small number of variables (polynomial in $s$ and $1/\epsilon$). The crucial properties are:
- No variable or block has variation in a narrow forbidden window around the random threshold.
- Each "high-variation" block contains exactly one high-influence coordinate.
- The restricted function $f'$ is an $s$-sparse polynomial with small distance to $f$.
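The restriction in the lemma can be illustrated as follows (hypothetical helper; the block variation estimates and the threshold are taken as inputs):

```python
def restrict_to_high_blocks(f, n, blocks, variations, theta):
    """Fix every variable in a block whose estimated variation falls
    below the threshold theta to zero, producing the restricted
    function f' on the surviving ("live") coordinates."""
    live = sorted(i for b, v in zip(blocks, variations) if v >= theta for i in b)

    def f_restricted(y):
        x = [0] * n  # pruned coordinates stay fixed to zero
        for slot, i in enumerate(live):
            x[i] = y[slot]
        return f(tuple(x))

    return f_restricted, live
```

For $f(x) = x_0 \oplus x_2$ with a low-variation middle block, the restriction yields a two-variable junta, which is the small function handed to the exact learner.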
For zero testing over finite fields, the sphere-of-influence lemma quantifies the proximity (in Hamming distance) to a nonzero output, yielding efficient hitting sets for identity testing (Aichinger et al., 2023).
4. Query and Time Complexity: Comparative Table
| Reference | Setting | Query Complexity | Time Complexity |
|---|---|---|---|
| (0805.1765) | $s$-sparse $\mathbb{F}_2$ polynomials | $\mathrm{poly}(s, 1/\epsilon)$ | $\mathrm{poly}(s, 1/\epsilon)$ |
| (Bshouty, 2022) | $s$-sparse Boolean polynomials | near-optimal; matches lower bounds up to polylog factors | polynomial |
| (Aichinger et al., 2023) | sparse polynomials over $\mathbb{F}_q$, zero testing | size of a Hamming ball of bounded radius | same as query complexity |
Bshouty's tester achieves query complexity matching known lower bounds up to polylogarithmic factors, establishing near-optimality (Bshouty, 2022). The Diakonikolas–Lee–Matulef–Servedio–Wan tester is the first to achieve both efficient query and time complexity for Boolean sparse polynomials (0805.1765).
5. Methodological Innovations and Underlying Techniques
- Implicit Learning: Replacing brute-force consistency search with a proper exact learning algorithm for $s$-sparse polynomials allows simulation of membership queries and a polynomial speedup (0805.1765).
- Random Restriction: Restricting variables based on estimated variation systematically reduces the function's complexity, effectively producing a small junta on which exact learning is tractable.
- Dimension Reduction via Hashing: Random hash-based projections enable robust reduction of the ambient dimension while preserving sparsity, crucial in the near-optimal tester design (Bshouty, 2022).
- Hitting Sets and Spheres of Influence: For general fields, the sphere-of-influence theorem provides explicit deterministic hitting-set constructions for identity testing sparse polynomials (Aichinger et al., 2023).
6. Extensions and Connections
The zero-testing framework in (Aichinger et al., 2023) generalizes to systems of sparse equations by aggregating them into a single system and applying the Hamming ball search. This approach, based on the combinatorial Nullstellensatz and absorbing polynomials, yields analogous complexity bounds for verifying the existence of solutions to such systems over finite fields.
The connections between property testing, exact learning, and algebraic coding theory suggest further research into testing richer function classes (e.g., higher-degree polynomials, approximate sparsity), black-box polynomial identity testing, and derandomization of testing algorithms. A plausible implication is that advances in proper learning algorithms directly inform the design of faster testers for sparse structures.
7. Historical Development and Optimality Considerations
Early approaches to sparsity testing relied on testing by implicit learning [DLM+07], which decoupled the detection of structure (via independence tests) from the learning of the restricted function, but incurred exponential time. The integration of the Schapire–Sellie exact learner and refined structural analysis in (0805.1765) made polynomial-time testing feasible.
The refined "learn-then-test" approach and dimension reduction techniques in (Bshouty, 2022) set the current optimal frontiers for query and time complexity. Lower bounds established in (Bshouty, 2022) confirm the necessity of scaling, providing a benchmark for all future algorithms. The extension to finite field evaluation domains and hitting-set analyses in (Aichinger et al., 2023) expands the relevance of sparsity testing to zero testing and systems of equations in algebra and coding theory.