Poison Budget Lower Bound in Robust Learning

Updated 12 February 2026

Poison Budget Lower Bound is a measure that quantifies the minimum adversarial modifications required to degrade statistical learning accuracy and certification guarantees.
It reveals trade-offs between adversarial power, sample complexity, and learning rates across regression, classification, and bandit models.
Mass-shifting and combinatorial arguments establish these bounds, which in turn guide the design of robust defense mechanisms.

The poison budget lower bound quantifies the minimum number, or fraction, of data points an adversary must alter in a training dataset to force a statistical learning algorithm to suffer nontrivial degradation in accuracy, robustness, or certification guarantees. Precise lower bounds expose the inherent trade-offs between adversarial power, sample complexity, learning rates, and algorithmic robustness for a variety of statistical tasks and models. These bounds are fundamental to understanding the limits of robust learning, as they delineate what is achievable regardless of algorithmic improvements, and are frequently matched (up to logarithmic factors) by attack and defense constructions.

1. Formal Problem Setup and Model Variants

Poison budget lower bounds are classically studied under settings where an adversary, subject to a constraint (q-modification, η-fraction of points, clean-label or arbitrary-value replacements), seeks to maximize risk or force specific prediction failures after learning. Key problem parameters include:

Dataset size ($N$, $m$, or $n$, depending on the source paper): Number of training samples.
Poison budget ($q$ or $\eta$; $m$ in the model-targeted setting of Section 2): Absolute number $q$ or fraction $\eta$ of points the adversary can manipulate.
Attack model: White-box/sample-aware modification, clean-label vs. arbitrary-label, instance-targeted vs. global, malicious noise (random corruption) vs. worst-case selection.

Foundational threat models include the "poison-q" setting (replace up to $q$ samples in a dataset of size $N$), fraction-based η-budgeted adversaries, and task-specific objectives in regression, classification, and bandit algorithms (Zhao et al., 2023, Balcan et al., 2022, Chornomaz et al., 2025).
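The "poison-q" constraint can be made concrete with a small sketch. The helper names `apply_poison` and `budget_fraction`, the label-flip replacement, and the toy dataset are illustrative assumptions rather than anything taken from the cited papers; the adversary is modelled simply as a chosen index set plus a replacement rule.

```python
def apply_poison(dataset, indices, replace, q):
    """Return a copy of `dataset` in which the chosen samples are replaced.

    `indices` is the adversary's (worst-case) choice of samples and `replace`
    maps a clean sample to its poisoned version. Both are hypothetical inputs.
    """
    indices = sorted(set(indices))
    if len(indices) > q:
        raise ValueError("adversary exceeded its budget q")
    poisoned = list(dataset)
    for i in indices:
        poisoned[i] = replace(dataset[i])
    return poisoned


def budget_fraction(q, n):
    """Convert an absolute budget q into the fraction eta = q / n."""
    return q / n


# Toy dataset of (x, y) pairs with labels in {-1, +1}.
clean = [(x / 10.0, 1) for x in range(10)]
flip_label = lambda sample: (sample[0], -sample[1])

poisoned = apply_poison(clean, indices=[0, 3, 7], replace=flip_label, q=3)
changed = sum(a != b for a, b in zip(clean, poisoned))
eta = budget_fraction(changed, len(clean))
print(changed, eta)
```

Passing the index set explicitly corresponds to worst-case selection; drawing it at random would instead model the malicious-noise variant mentioned above.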

2. Representative Lower Bounds: Precise Forms and Scaling

Nonparametric Regression (Hölder Class, ℓ₀ Poisoning)

For robust regression over the Hölder class $\Sigma(\alpha, L)$ on $\mathbb{R}^d$, under a "poison-q" attack (replace up to $q$ out of $N$ samples), the minimax lower bounds on estimation error are (Zhao et al., 2023):

$R_2(N, q; \alpha, d) \ge c\,\left\{ (q/N)^{(2\alpha + d)/(d+1)} + N^{-2\alpha/(2\alpha + d)} \right\}$

$R_\infty(N, q; \alpha, d) \ge c\,\left\{ (q/N)^{\alpha/(d+1)} + N^{-\alpha/(2\alpha + d)} \right\}$

These establish regime changes: for $q \lesssim N^{d/(2\alpha + d)}$, risk is statistically limited; for $q \gtrsim N^{d/(2\alpha + d)}$, attack-limited rates dominate.
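To see how the two terms trade off, the following sketch evaluates the attack term and the statistical term of the two bounds as written above. The unknown constant $c$ is set to 1, and the function names are hypothetical; the output indicates only which term is larger for given inputs, not an exact risk value.

```python
def linf_bound_terms(n, q, alpha, d):
    """Attack and statistical terms of the sup-norm bound (constant c taken as 1)."""
    attack = (q / n) ** (alpha / (d + 1))
    statistical = n ** (-alpha / (2 * alpha + d))
    return attack, statistical


def l2_bound_terms(n, q, alpha, d):
    """Attack and statistical terms of the R_2 bound (constant c taken as 1)."""
    attack = (q / n) ** ((2 * alpha + d) / (d + 1))
    statistical = n ** (-2 * alpha / (2 * alpha + d))
    return attack, statistical


def dominant_regime(n, q, alpha, d):
    """Report which term of the sup-norm bound is larger."""
    attack, statistical = linf_bound_terms(n, q, alpha, d)
    return "attack-limited" if attack > statistical else "statistically limited"


for q in (1, 10, 100, 1000):
    attack, statistical = linf_bound_terms(10_000, q, alpha=1.0, d=1)
    print(q, round(attack, 4), round(statistical, 4),
          dominant_regime(10_000, q, alpha=1.0, d=1))
```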

Classification: Robustly-Reliable Learners

In the agnostic and realizable PAC learning settings under instance-targeted and global poisoning, sharp lower bounds depend on VC-dimension $d$:

Agnostic, instance-targeted: To force excess error $\varepsilon$, the minimal necessary corruption fraction is $\eta = \Omega(\varepsilon^2/d)$. The associated excess error scales as $\Omega(\sqrt{d\eta})$ (Chornomaz et al., 2025).
Region-certifiably robust learning: For robustly-reliable algorithms, the certifiable region cannot exceed the agreement region of hypotheses with error at most $2\eta$. For halfspaces under isotropic log-concave distributions, this implies that to render a constant-fraction $c>0$ of the input space unpredictable, one needs $\eta = \Omega(c/\sqrt{d})$ (Balcan et al., 2022).
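The two scalings above are inverse to one another, which a short calculation makes explicit. In this sketch all hidden constants in the $\Omega(\cdot)$ notation are assumed to be 1 and the function names are hypothetical, so the numbers indicate orders of magnitude only.

```python
import math


def min_corruption_fraction(eps, d):
    """eta ~ eps^2 / d: corruption fraction needed to force excess error eps."""
    return eps ** 2 / d


def excess_error_floor(eta, d):
    """sqrt(d * eta): excess error forced by a corruption fraction eta."""
    return math.sqrt(d * eta)


def region_corruption_fraction(c, d):
    """eta ~ c / sqrt(d): fraction needed to make a c-fraction of inputs uncertifiable."""
    return c / math.sqrt(d)


eta = min_corruption_fraction(eps=0.1, d=10)
print(eta, excess_error_floor(eta, d=10), region_corruption_fraction(c=0.2, d=100))
```

Feeding the output of `min_corruption_fraction` back into `excess_error_floor` recovers the target $\varepsilon$, which is the sense in which the two statements describe the same threshold.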

Model-targeted Convex Learning (ERM)

Given a convex per-example loss $l(\theta; x, y)$, dataset size $N$, and target model parameter $\theta_p$, the minimal poison budget $m$ to make $\theta_p$ the minimizer for the empirical objective is tightly lower-bounded by (Suya et al., 2020):

$m \ge \sup_{\theta \in \Theta} \frac{L(\theta_p; D_c) - L(\theta; D_c) + N C_R [R(\theta_p) - R(\theta)]}{\sup_{x,y}\left[l(\theta; x, y) - l(\theta_p; x, y)\right] + C_R [R(\theta) - R(\theta_p)]}$

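Here $L(\theta; D_c)$ denotes the total loss on the clean dataset $D_c$, $R$ is the regularizer, and $C_R$ is its weight. For canonical cases such as hinge-loss SVMs or logistic regression, this scales as $m = \Omega(1/\epsilon)$ when attempting to reach an $\epsilon$-level discrepancy.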
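The bound can be evaluated numerically in a toy case. The sketch below assumes a one-dimensional hinge-loss model $x \mapsto \theta x$, takes $L$ as the summed loss over the clean set, uses $R(\theta) = \theta^2/2$ as an assumed regularizer, and approximates both suprema by finite candidate sets. The function names and the dataset are hypothetical. Restricting the outer supremum to a finite list still yields a valid (possibly weaker) lower bound, whereas a grid for the inner supremum is only safe when it contains the maximizer, as it does here.

```python
import numpy as np


def hinge(theta, x, y):
    """Hinge loss of the 1-D linear model x -> theta * x."""
    return np.maximum(0.0, 1.0 - y * theta * x)


def poison_lower_bound(theta_p, x_clean, y_clean, thetas, x_grid, c_r=0.0):
    """Evaluate the ratio bound over candidate thetas (a sketch, not the paper's code)."""
    reg = lambda t: 0.5 * t ** 2  # assumed L2 regularizer R
    n = len(x_clean)
    loss_at_target = hinge(theta_p, x_clean, y_clean).sum()
    best = 0.0
    for theta in thetas:
        numerator = (loss_at_target - hinge(theta, x_clean, y_clean).sum()
                     + n * c_r * (reg(theta_p) - reg(theta)))
        # Largest per-point gain of theta over theta_p, searched over the grid
        # of admissible inputs and both labels.
        worst_point = max(
            (hinge(theta, x_grid, y) - hinge(theta_p, x_grid, y)).max()
            for y in (-1.0, 1.0)
        )
        denominator = worst_point + c_r * (reg(theta) - reg(theta_p))
        if numerator > 0 and denominator > 0:
            best = max(best, numerator / denominator)
    return best


# Twenty identical clean points (x=1, y=+1); the attacker wants theta_p = -1.
x_clean = np.ones(20)
y_clean = np.ones(20)
bound = poison_lower_bound(
    theta_p=-1.0,
    x_clean=x_clean,
    y_clean=y_clean,
    thetas=np.array([0.0, 0.5, 1.0, 2.0]),
    x_grid=np.linspace(-1.0, 1.0, 201),
)
print(bound)
```

In this toy case the bound equals the number of clean points, which matches the intuition that an unregularized hinge model fitted to $N$ identical points can only be flipped by adding about $N$ opposing points.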

3. Proof Techniques and Tightness

Packing/Le Cam Mass-Shifting

Many lower bounds are realized via hard two-point constructions: building two functions, models, or distributions that are distinguishable only within a small region, matched to the poison budget. If the adversary can confound the algorithm sufficiently within this region ("mass-shifting"), minimax error bounds as above follow. For bandits, lower bounds are established by quantifying the minimal shift required to bias empirical means enough to overturn regret bounds (Rangi et al., 2021).

Combinatorial Arguments

In robustly-reliable learning, bounds are set by combinatorial properties of the hypothesis class, such as the disagreement region and VC-dimension (e.g., agreement balls of radius $2\eta$ under empirical or population error). Matching upper and lower bounds are often achieved by ERM-based algorithms, certifying the sharpness of these thresholds (Balcan et al., 2022).

Special Techniques: Bandits

For stochastic bandits, a contamination budget of order $\log T$ is both necessary and sufficient to force any order-optimal (log-regret) algorithm to linear regret: $C = \Omega(\log T)$ is required, and a budget of $O(\log T)$ suffices. The threshold follows from the classical UCB analysis, in which suboptimal arms are pulled only logarithmically often (Rangi et al., 2021).
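A toy simulation illustrates why a logarithmic budget can be enough. This is a simplified sketch, not the construction of Rangi et al.: rewards are deterministic, there are two arms with assumed means 0.9 and 0.5, the attacker overwrites the better arm's observed reward with 0, and the budget is counted as the number of corrupted rounds. All names are hypothetical.

```python
import math


def ucb1_suboptimal_pulls(horizon, budget, means=(0.9, 0.5)):
    """Run UCB1 on two deterministic arms under a reward-zeroing attacker.

    Returns (pulls of the worse arm, number of corrupted rounds).
    """
    counts = [0, 0]
    totals = [0.0, 0.0]
    used = 0
    for t in range(1, horizon + 1):
        if t <= 2:
            arm = t - 1  # pull each arm once to initialise
        else:
            index = [
                totals[a] / counts[a] + math.sqrt(2.0 * math.log(t) / counts[a])
                for a in (0, 1)
            ]
            arm = 0 if index[0] >= index[1] else 1
        reward = means[arm]
        if arm == 0 and used < budget:
            reward = 0.0  # attacker hides the better arm's reward
            used += 1
        counts[arm] += 1
        totals[arm] += reward
    return counts[1], used


T = 5000
log_budget = math.ceil(10 * math.log(T))
clean_pulls, _ = ucb1_suboptimal_pulls(T, budget=0)
attacked_pulls, used = ucb1_suboptimal_pulls(T, budget=log_budget)
print(clean_pulls, attacked_pulls, used, log_budget)
```

Without contamination the worse arm is pulled only logarithmically often, while with a budget proportional to $\log T$ it is pulled in almost every round, because the better arm is sampled too rarely for the budget to run out within the horizon.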

4. Task-specific Regimes and Parameter Dependencies

The following table summarizes principal scaling dependencies in several settings.

Learning Setting	Minimal Poison Budget	Error Target Scaling
Nonparametric regression ($\ell_\infty$)	$q = \Omega(N\epsilon^{(d+1)/\alpha})$	$\mathbb{E}[\|\hat f - f\|_\infty] \gtrsim \epsilon$
Agnostic learning (VC-d, targeted)	$\eta = \Omega(\epsilon^2/d)$	excess error $\Omega(\sqrt{d\eta})$
Robust region-certification	$\eta = \Omega(c/\sqrt{d})$	Region of measure $c$ uncertifiable
Convex ERM ($\epsilon$-target)	$m = \Omega(1/\epsilon)$	Empirical loss gap $\geq \epsilon$
Bandit (linear regret)	$C = \Omega(\log T)$	Regret $\Omega(T)$

A salient implication is that for high-dimensional problems and complex hypothesis classes (large $d$), the adversary's minimal budget shrinks as $d$ grows, a manifestation of the curse of dimensionality for robust learning.

5. Implications for Robust Learning and Defense Mechanisms

These lower bounds demonstrate that, irrespective of learning algorithm, certain adversarial power thresholds must be exceeded to break statistical or certification guarantees. In practice, matching upper bounds indicate that robust estimators, ERM + Lipschitz projection, and aggregation mechanisms (e.g., bagging) offer near-optimal resistance, but can be fundamentally compromised once these poison budgets are reached (Zhao et al., 2023, Chen et al., 2022). For bagging ensembles, the lower bound on the poison budget needed to simultaneously flip $k$ predictions is given (in the "no-overlap" regime) by $B_{\rm lb}(k) = \lceil k / \max_i|\mathcal{S}_i| \rceil$, directly tying ensemble structure to attack resilience (Chen et al., 2022).
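The bagging formula is simple enough to evaluate directly. In the sketch below the set sizes $|\mathcal{S}_i|$ are taken as given inputs, since the sets themselves are not defined here, and the function name is hypothetical.

```python
import math


def bagging_lower_bound(k, subset_sizes):
    """B_lb(k) = ceil(k / max_i |S_i|) in the no-overlap regime."""
    if k < 0 or not subset_sizes or max(subset_sizes) <= 0:
        raise ValueError("need k >= 0 and at least one positive set size")
    return math.ceil(k / max(subset_sizes))


print(bagging_lower_bound(10, [3, 4, 2]))
print(bagging_lower_bound(8, [4]))
```

Because the largest set size sits in the denominator, the bound grows when every $|\mathcal{S}_i|$ is kept small, which is the link between ensemble structure and attack resilience noted above.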

6. Special Cases: Hybrid and Evasion-Poison Attacks

Hybrid attacks, which combine a small poison budget with nontrivial test-time evasion, render PAC learning impossible under error-region risk, even when the allowed poisoning fraction is inverse-polynomial in the dataset size (Diochnos et al., 2019). This exposes the brittleness of robust PAC learning under such threat models: even a vanishing poison fraction can be fatal for robust learnability in expressive hypothesis classes.

7. Significance, Optimality, and Research Directions

Poison budget lower bounds are minimax-optimal in the settings discussed, typically achieved by explicit adversary constructions and defense algorithms. Their tightness has been validated empirically across real-world data and models (Suya et al., 2020, Chornomaz et al., 2025). These results underlie much of the contemporary theory motivating certified robustness, adversarial machine learning, and algorithmic design for high-stakes deployment, and continue to inform both foundational limits and practical defense strategies.


References (7)

Zhao et al. (2023). Robust Nonparametric Regression under Poisoning Attack.

Balcan et al. (2022). Robustly-reliable learners under poisoning attacks.

Chornomaz et al. (2025). Agnostic Learning under Targeted Poisoning: Optimal Rates and the Role of Randomness.

Suya et al. (2020). Model-Targeted Poisoning Attacks with Provable Convergence.

Rangi et al. (2021). Saving Stochastic Bandits from Poisoning Attacks via Limited Data Verification.

Chen et al. (2022). On Collective Robustness of Bagging Against Data Poisoning.

Diochnos et al. (2019). Lower Bounds for Adversarially Robust PAC Learning.
