Multi-Sample Lower Bounds in Learning

Updated 10 February 2026

Multi-sample lower bounds are theoretical limits that formalize the trade-off between sample complexity and computational resources in various learning and inference settings.
They are established using methods such as communication complexity reductions, potential function tracking, and information divergence arguments.
These bounds have broad applications in streaming, query complexity, high-dimensional statistics, and quantum algorithms, guiding optimal algorithm design.

A multi-sample lower bound is a foundational concept in computational learning theory, streaming algorithms, high-dimensional statistics, and quantum information, quantifying the trade-off between the number of independent examples (samples) and computational resources (such as memory, query complexity, or space) necessary for a given inference or learning task. Such lower bounds establish fundamental impossibility results, asserting that small-memory, low-query, or low-space algorithms must incur an exponential or super-polynomial sample complexity to achieve non-negligible performance, or vice versa. These bounds are typically established through reductions to communication or information complexity, divergence arguments, and extractor or packing constructions, providing quantitative hardness results that often generalize to wide classes of problems.

1. Formal Framework and Notion of Multi-Sample Lower Bounds

The multi-sample lower bound paradigm arises across several computational models:

Streaming Models: Learners/processors receive a stream of i.i.d. (or adaptively chosen) examples and are restricted in memory, number of passes, or processing time.
Query Complexity: Algorithms probe an oracle-supplied function or distribution with bounded numbers of adaptive/non-adaptive queries.
Statistical Estimation: Sample complexity lower bounds set the minimum number of independent samples required under resource constraints.
Quantum Algorithms: Quantum sample-to-query lifting quantifies the relation between the number of quantum samples and queries needed for property testing.

A general multi-sample lower bound states that for a class of problems (e.g., learning parities, detecting distributional change, density estimation, property testing), at least one of the following must hold:

The algorithm uses super-polynomial (often exponential) number of samples,
The algorithm requires large (typically quadratic or more) memory/space/query resources,
The algorithm succeeds with only negligible probability.

These trade-offs are sharply characterized for classic problems such as parity learning (Lyu et al., 2023), change detection in graphical models (Gangrade et al., 2017), density estimation (Aamand et al., 2024), streaming frequency estimation (Lovett et al., 2023), quantum property testing (Wang et al., 2023), and several others.

2. Core Methodologies for Establishing Multi-Sample Lower Bounds

The construction and proof of multi-sample lower bounds rely on a set of architectural techniques:

Branching Program and Communication Complexity Reductions: A streaming learner is modeled as a bounded-width, multi-pass branching program, and the learning or distinguishing task is reduced to a fundamental communication problem such as universal relation, unique set-disjointness or parity search (Lyu et al., 2023, Lovett et al., 2023, Nelson et al., 2017).
Potential and Bias-Counter Methods: The evolution of the learner's internal state is tracked via a potential function (such as the ℓ₂-norm of posterior distributions or squared bias), and key steps use drift or enlargement arguments to bound information concentration (Lyu et al., 2023).
Transfer Lemmas: To lift single-pass bounds to multi-pass models, transfer lemmas bridge the progress possible across passes, enabling iteration of inductive argumentation (Lyu et al., 2023).
Information/Divergence Bounds: χ²-divergence or mutual information arguments show that the advantage of any small-memory or low-query algorithm in distinguishing or detecting structure is exponentially small unless resource constraints are met (Gangrade et al., 2017, Fawzi et al., 2023, Chewi et al., 2022).
Extractor and Packing Constructions: Extractor matrices or hard packing of parameters (e.g., low-bias Fourier spectra, or high-mass bumps in sampling) establish that the observed evidence remains indistinguishable without high-complexity or high-query regimes (Lyu et al., 2023, Chewi et al., 2022).
Sample-to-Query Lifting (Quantum): Lifting theorems relate sample complexity to query complexity by simulating queries with block-encodings built from samples, resulting in quadratic relations between the two (Wang et al., 2023).

3. Canonical Results and Problem-Specific Bounds

This section summarizes archetypal multi-sample lower bounds across representative problems.

Problem Domain	Lower Bound Type	Resource Trade-off	Reference
Parity learning (q-pass streaming)	Memory vs. sample (tight curve)	$M \ge n^2/c_q$ or $T \ge 2^{\Omega(n/c_q)}$ ; $M \cdot \log M + T \ge \exp(\Omega(n))$	(Lyu et al., 2023)
Change detection (Ising, GMRF)	Sample size per instance	$\Omega(\frac{d^2}{(\log d)^2} \log p)$ (Ising); $\Omega(\gamma^{-2} \log p)$ (Gaussian)	(Gangrade et al., 2017)
Streaming frequency estimation	Space-sample trade-off	$s = \Omega(1/(p^2 t \log t))$	(Lovett et al., 2023)
Density estimation (data structures)	Query time vs. sample bound	$S = O(n/\log^c k)$ samples $\implies$ $T \geq k^{1-O(1)/\log\log k}$	(Aamand et al., 2024)
Quantum property testing/lifting	Query vs. sample complexity (quadratic)	$Q = \widetilde{\Omega}(\sqrt{S})$	(Wang et al., 2023)
Learning Pauli channels	Channel uses (adaptive vs. non-adaptive)	$T \ge \Omega(2^{2n}/\epsilon^2)$ (adaptive), $T \ge \Omega(2^{3n}/\epsilon^2)$ (non-adaptive)	(Fawzi et al., 2023)
Sampling from non-log-concave distributions	Oracle query complexity	Large-FI: $\Omega(K_0/d)$ ; Small-FI: $\Omega(\textrm{poly}(1/\epsilon))$	(Chewi et al., 2022)
Log-concave sampling	First-order queries	Constant $d$ : $\Omega(\log \kappa)$ ; General: $\widetilde{\Omega}(\min(\sqrt{\kappa} \log d, d))$	(Chewi et al., 2023)

These results are often tight, with matching or nearly-matching algorithms provided in the respective works.

4. Generalizations and Extensions

The machinery supporting multi-sample lower bounds frequently extends to broader classes of problems:

Extractor-Based Concept Classes: Any concept class whose associated evaluation matrix is a good $L_2$ -extractor (e.g., DNFs, juntas, decision trees, low-degree polynomials) satisfies the same lower bounds as parity learning (Lyu et al., 2023).
Multi-Sample and Multi-Instance Problems: The lower bound techniques generalize to estimating $k$ outputs, $k$ -wise sampling, or distinguishing in $k$ -party set-disjointness settings (Nelson et al., 2017, Lovett et al., 2023).
Parameter Sensitivity (Noise/Condition Number): The sample lower bounds may scale polynomially or exponentially in signal-to-noise ratio, condition number, parameter gaps, minimum coupling, or other structural parameters (Gangrade et al., 2017, Balanov et al., 21 Jan 2025, Chewi et al., 2023).
Hybridization and Adaptive Analysis: For certain cryptographic primitives or local pseudorandom generators, time-space trade-offs are established even under adaptive adversaries or streaming cryptanalytic settings, using hybrid arguments and Fourier analysis of predicates (Garg et al., 2020).

5. Consequences for Algorithms and Complexity

These lower bounds yield several critical insights for theory and practice:

Tightness and Achievability: For most canonical problems, the established lower bounds match known algorithmic upper bounds up to logarithmic or constant factors, demonstrating the optimality of prevailing techniques (e.g., Flammia–Wallman’s Pauli learning procedure, block Krylov Gaussian sampling, autocorrelation-based multi-target detection) (Lyu et al., 2023, Fawzi et al., 2023, Aamand et al., 2024, Balanov et al., 21 Jan 2025, Chewi et al., 2023).
Trade-off Frontiers: The explicit trade-off curves (e.g., for $q$ -pass parity, $M \ge n^2/c_q$ or $T \ge 2^{n/c_q}$ ) set hard frontiers for algorithm design, quantifying the infeasibility of breaking the time-space or sample-memory barrier through algorithmic ingenuity alone (Lyu et al., 2023).
Optimality of "Learn-then-Compare" or Two-Step Approaches: For Ising and Gaussian change detection, the naive approach of fully learning both structures and then performing a comparison is minimax optimal in broad regimes; only in restricted models do direct change detection methods yield improvements (Gangrade et al., 2017).
Quantum Quadratic Gap: The sample-to-query lifting theorem establishes a universal quadratic gap between quantum sample complexity and quantum query complexity for property testing, and central lower bounds for phase estimation, Gibbs sampling, and entropy estimation can all be seen as applications (Wang et al., 2023).
Cryptographic Strength: Bounded-memory lower bounds for distinguishing cryptographically generated sequences demonstrate that security can be based on resilience of predicates and stretch, justifying stream ciphers and PRGs against memory-limited adversaries (Garg et al., 2020).
Complexity Separation: A regime-dependent separation is evident—for example, between log-concave and non-log-concave sampling at high precision, or between MDPs and multi-step revealing POMDPs (where, for the latter, regret suffers a $\Omega(T^{2/3})$ lower bound, ruling out $\sqrt{T}$ rates) (Chewi et al., 2022, Chen et al., 2023).

Multi-sample lower bounds have shaped the theoretical boundaries for learning, streaming algorithms, quantum computing, statistical inference, cryptography, and average-case complexity. Their ramifications include:

Benchmarking Algorithmic Optimality: By establishing lower bounds that match, up to logarithmic factors, the best explicit algorithms, these results provide benchmarks for algorithm developers and clarify inherent limits across computational paradigms.
Guidance for Model Selection: Recognizing when direct (single-step) methods cannot outperform naive approaches unless specific structure or incoherence conditions are available (e.g., in structural change detection) helps focus research on meaningful model restrictions (Gangrade et al., 2017).
Design of Resilient Cryptographic Primitives: Multi-sample lower bounds identify resilience levels necessary for achieving bounded-memory security with super-linear stretch (Garg et al., 2020).
Quantum Algorithm Lower Bounds: The unification of quantum sample and query lower bounds informs quantum algorithm design for property testing, phase estimation, and simulation tasks (Wang et al., 2023).
Open Questions: Improvements in extractor constructions, tighter sample-efficient testing, sharper high-dimensional statistical lower bounds, and quantum-classical gap separations remain active areas.

7. Representative Examples

The following illustrates paradigmatic results in more detail:

Parity Learning (q-Pass): For any constant $q$ , every q-pass streaming learner for $n$ -bit parity, using $M$ bits of memory and $T$ samples, must obey either $M \geq n^2/c_q$ or $T \geq 2^{n/c_q}$ , with $c_q = 100^{3^q}$ , and the success probability falls off exponentially if both are violated (Lyu et al., 2023).
Density Estimation: In the list-of-points model, any data structure using $O(n/\log^c k)$ samples must resort to nearly linear time in the database size $k$ , with $T \geq k^{1-O(1)/\log\log k}$ (Aamand et al., 2024).
Pauli Channel Learning: For $n$ -qubit Pauli channels, non-adaptive ancilla-free protocols require $T \geq \Omega(2^{3n}/\epsilon^2)$ channel uses; adaptive protocols require at least $\Omega(2^{2n}/\epsilon^2)$ when $\epsilon$ exceeds the inverse dimension (Fawzi et al., 2023).
High-Noise Multi-Target Detection: Generic group-action multi-target detection in high noise requires $\omega(\sigma^6)$ samples for identifiability, matching the sample complexity of corresponding multi-reference alignment problems (Balanov et al., 21 Jan 2025).
Quantum Sample-to-Query Lifting: For any quantum property testing problem, the quantum query complexity lower bound is at least $\widetilde{\Omega}(\sqrt{S})$ if the sample complexity is $S$ , with tightness for state discrimination and matching known bounds for Gibbs sampling, phase estimation, amplitude estimation, and entanglement entropy testing (Wang et al., 2023).

In summary, multi-sample lower bounds constitute a critical toolkit for certifying the hardness of inference, learning, and decision problems under concrete resource constraints, providing sharp phase transitions, and guiding optimal algorithm design across a wide spectrum of modern computational and statistical disciplines.