
Interval Bound Propagation (IBP)

Updated 12 November 2025
  • Interval Bound Propagation (IBP) is a method that propagates input uncertainty through neural network layers using interval arithmetic to yield certified output bounds.
  • It integrates directly into the training loss to enforce robust margins and achieve provable robustness against norm-bounded perturbations.
  • Enhancements like CROWN, LBP, and ParamRamp activation improve bound tightness and verified accuracy on datasets such as MNIST, CIFAR-10, and Tiny-ImageNet.

Interval Bound Propagation (IBP) is a scalable and analytically grounded technique for training and verifying neural networks with provable robustness guarantees against norm-bounded or structured input perturbations. It operates by forward-propagating axis-aligned interval bounds, constructed over an input uncertainty set (typically an $\ell_p$-ball), through each layer of the network using interval arithmetic. This yields certified elementwise lower and upper bounds on the outputs for all admissible input perturbations. IBP is computationally lightweight and differentiable, enabling its use directly in the loss function during robust training. Despite its practical successes—such as state-of-the-art verified robust accuracy on MNIST, CIFAR-10, and Tiny-ImageNet—IBP's looseness (especially the “wrapping effect” caused by compounding over-approximations) and associated limitations have motivated theoretical investigations, practical algorithmic improvements, and innovations in model architecture and training protocols.

1. Formal Definition and Layerwise Propagation

Consider a feed-forward network of $m$ layers:

$$z^{(k)} = W^{(k)} a^{(k-1)} + b^{(k)}, \quad a^{(k)} = \sigma(z^{(k)}), \quad k = 1, \dots, m,$$

with $a^{(0)} = x$ (the input) and $\sigma$ an elementwise monotonic activation (e.g., ReLU). Let the input belong to a convex uncertainty set, usually an $\ell_p$-ball,

$$\mathbb{B}_p(x_0, \epsilon) = \{x : \|x - x_0\|_p \leq \epsilon\}.$$

IBP maintains for each layer $k$ a lower bound $\underline{z}^{(k)}$ and an upper bound $\overline{z}^{(k)}$ such that for every $x \in \mathbb{B}_p(x_0, \epsilon)$ and every neuron $i$:

$$\underline{z}_i^{(k)} \leq z_i^{(k)}(x) \leq \overline{z}_i^{(k)}.$$

The bounds are computed recursively as follows:

  • Input Layer (Base Case): By Hölder’s inequality,

$$\underline{z}^{(1)} = W^{(1)} x_0 + b^{(1)} - \epsilon \|W^{(1)}\|_q, \quad \overline{z}^{(1)} = W^{(1)} x_0 + b^{(1)} + \epsilon \|W^{(1)}\|_q, \quad \text{where } 1/p + 1/q = 1,$$

with the $q$-norm taken row-wise (one Hölder bound per output neuron).

  • Affine Layers: For $k > 1$,

$$\begin{aligned} \underline{z}^{(k)} &= \mathrm{ReLU}(W^{(k)})\,\underline{a}^{(k-1)} + \mathrm{Neg}(W^{(k)})\,\overline{a}^{(k-1)} + b^{(k)}, \\ \overline{z}^{(k)} &= \mathrm{ReLU}(W^{(k)})\,\overline{a}^{(k-1)} + \mathrm{Neg}(W^{(k)})\,\underline{a}^{(k-1)} + b^{(k)}, \end{aligned}$$

with $\mathrm{ReLU}(W)_{ij} = \max\{W_{ij}, 0\}$ and $\mathrm{Neg}(W)_{ij} = \min\{W_{ij}, 0\}$.

  • Activation Layers: For any monotonic activation $\sigma$,

$$\underline{a}^{(k)} = \sigma(\underline{z}^{(k)}), \quad \overline{a}^{(k)} = \sigma(\overline{z}^{(k)}).$$

This process iterates forward, yielding at the output layer an enclosure for each logit.
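To make the recursion concrete, the following minimal NumPy sketch propagates interval bounds through a fully connected ReLU network for the $\ell_\infty$ case ($p = \infty$), where the input box is exact. The network, shapes, and random parameters are purely illustrative assumptions, not taken from any cited work.

```python
import numpy as np

def ibp_forward(weights, biases, x0, eps):
    """Propagate elementwise interval bounds through an affine + ReLU network.

    weights[k] has shape [out_k, in_k], biases[k] has shape [out_k].
    x0 is the nominal input; eps is the l_inf radius of the input box.
    Returns (lower, upper) bounds on the output logits over the whole box.
    """
    lower, upper = x0 - eps, x0 + eps  # base case: an l_inf ball is an axis-aligned box
    for k, (W, b) in enumerate(zip(weights, biases)):
        W_pos, W_neg = np.maximum(W, 0.0), np.minimum(W, 0.0)
        # Affine layer: positive weights pair with the matching bound,
        # negative weights with the opposite bound.
        z_low = W_pos @ lower + W_neg @ upper + b
        z_up  = W_pos @ upper + W_neg @ lower + b
        if k < len(weights) - 1:
            # Monotonic activation (ReLU): apply elementwise to both endpoints.
            lower, upper = np.maximum(z_low, 0.0), np.maximum(z_up, 0.0)
        else:
            lower, upper = z_low, z_up
    return lower, upper

# Illustrative usage with a tiny random two-layer network.
rng = np.random.default_rng(0)
Ws = [rng.standard_normal((8, 4)), rng.standard_normal((3, 8))]
bs = [np.zeros(8), np.zeros(3)]
lo, hi = ibp_forward(Ws, bs, rng.standard_normal(4), eps=0.1)
assert np.all(lo <= hi)  # the enclosure is elementwise consistent
```

For general $p$, only the base case changes: the input box is replaced by the per-neuron Hölder bounds for the first affine layer given above.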

2. Certified Robust Training and Optimization Objective

To train provably robust models, IBP is applied within the optimization objective to lower-bound the classification margin

$$\omega_i(x) = z^{(m)}_y(x) - z^{(m)}_i(x), \quad i \neq y,$$

uniformly over all $x$ in the input uncertainty set. The tightest certified lower margin is

$$\underline{\omega}^{\mathrm{IBP}}(x_0, \epsilon) = \min_{i \neq y} \left[ \underline{z}^{(m)}_y - \overline{z}^{(m)}_i \right].$$

The robust training loss is constructed as a mixture of standard cross-entropy on clean predictions and a penalty on the worst-case margin:

$$\mathbb{E}_{(x_0, y) \sim \mathcal{D}} \left[ \kappa\, L(z^{(m)}(x_0), y) + (1 - \kappa)\, L(-\underline{\omega}^{\mathrm{IBP}}(x_0, \epsilon), y) \right],$$

where $\kappa \in [0, 1]$ controls the trade-off. Since $\underline{\omega}^{\mathrm{IBP}}$ is differentiable with respect to the network parameters, the entire objective supports standard backpropagation and stochastic gradient descent (Lyu et al., 2021, Gowal et al., 2018).

Schedule annealing for $\epsilon$ (ramp-up) and $\kappa$ (ramp-down) is essential: gradually increasing $\epsilon$ early in training and slowly mixing in the robust loss via $\kappa$ are crucial for convergence and for tight final bounds (Gowal et al., 2018, Morawiecki et al., 2019).
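As a hedged illustration of how this objective is commonly implemented, the PyTorch-style sketch below mixes the clean cross-entropy with cross-entropy on worst-case logits built from the IBP output bounds (the worst-case logit construction follows Gowal et al. (2018) and corresponds to the margin bound above). The two-layer network, `eps`, and `kappa` values are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def ibp_bounds(linears, lower, upper):
    """Interval propagation through alternating Linear/ReLU layers (batched)."""
    for k, layer in enumerate(linears):
        W_pos, W_neg = layer.weight.clamp(min=0), layer.weight.clamp(max=0)
        z_low = lower @ W_pos.t() + upper @ W_neg.t() + layer.bias
        z_up  = upper @ W_pos.t() + lower @ W_neg.t() + layer.bias
        # ReLU on hidden layers only; the final layer outputs logits.
        lower, upper = (z_low.relu(), z_up.relu()) if k < len(linears) - 1 else (z_low, z_up)
    return lower, upper

def ibp_robust_loss(linears, x0, y, eps, kappa):
    """kappa * clean cross-entropy + (1 - kappa) * cross-entropy on worst-case logits."""
    logits = x0
    for k, layer in enumerate(linears):
        logits = layer(logits)
        if k < len(linears) - 1:
            logits = logits.relu()
    lower, upper = ibp_bounds(linears, x0 - eps, x0 + eps)
    # Worst case for the margin: true-class logit at its lower bound,
    # every other logit at its upper bound.
    true_class = F.one_hot(y, logits.shape[-1]).bool()
    worst = torch.where(true_class, lower, upper)
    return kappa * F.cross_entropy(logits, y) + (1 - kappa) * F.cross_entropy(worst, y)

# Illustrative usage; in practice eps is ramped up and kappa ramped down over training.
net = [nn.Linear(4, 8), nn.Linear(8, 3)]
x, y = torch.randn(16, 4), torch.randint(0, 3, (16,))
loss = ibp_robust_loss(net, x, y, eps=0.01, kappa=0.9)
loss.backward()  # the bound is differentiable, so standard SGD applies
```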

3. Comparison with CROWN and Linear Bound Propagation (LBP)

CROWN is a bounding method that uses tight linear (affine) relaxations for nonlinear activations—specifically, it replaces $\sigma(z)$ by parallel upper and lower linear functions over each pre-activation interval $[l, u]$:

$$h^L(z) \leq \sigma(z) \leq h^U(z).$$

CROWN then backpropagates the effect of these relaxations via dual variables to lower layers, yielding linear enclosures for the network outputs.

Selecting the constant bounding lines $h^L(z) = \sigma(l)$, $h^U(z) = \sigma(u)$ in CROWN causes it to collapse to the IBP bound. Using the tightest zero-intercept lines through $(l, \sigma(l))$ and $(u, \sigma(u))$ produces strictly tighter relaxations at unstable ReLU neurons (those whose pre-activation interval spans zero). The formal result (Theorem 3.3) is

$$[\underline{z}^{(m)}_{\mathrm{IBP}}, \overline{z}^{(m)}_{\mathrm{IBP}}] \supseteq [\underline{z}^{(m)}_{\mathrm{CROWN}}, \overline{z}^{(m)}_{\mathrm{CROWN}}],$$

i.e., CROWN's bounds are always at least as tight as IBP's when tight bounding lines are used (Lyu et al., 2021).
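For concreteness, with the conventions above (parallel bounding lines, zero-intercept lower line), the relaxation of an unstable ReLU neuron with pre-activation bounds $l < 0 < u$ can be written as below. Treating the shared slope as $u/(u-l)$ is a standard tight choice; CROWN more generally also admits other lower-line slopes in $[0, 1]$. This is a sketch of the usual construction, not a quotation from the cited paper.

```latex
% Tight parallel relaxation of ReLU(z) = max(z, 0) on [l, u] with l < 0 < u:
% the upper line passes through (l, 0) and (u, u); the lower line shares its
% slope and passes through the origin.
\begin{aligned}
  h^U(z) &= \frac{u}{u - l}\,(z - l), \\
  h^L(z) &= \frac{u}{u - l}\, z,
\end{aligned}
\qquad h^L(z) \le \mathrm{ReLU}(z) \le h^U(z) \quad \text{for all } z \in [l, u].
```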

LBP interpolates between full CROWN and IBP by truncating back-propagation of linear relaxations to a smaller number of layers—a pure LBP step back-propagates across only one nonlinear layer per bound, retaining $O(m)$ memory and computational cost. LBP's bounds are at least as tight as IBP's (when using tight bounding lines), with strictly lower verified error in empirical studies (Lyu et al., 2021).

4. Activation Function Impact: The ParamRamp Mechanism

IBP-trained ReLU networks often exhibit a preponderance of “dead” neurons—neurons whose pre-activations become entirely inactive ($\leq 0$) or robustly active ($\geq r$)—reducing representational capacity and certified performance. ParamRamp generalizes LeakyReLU with a learnable ramp point $r > 0$:

$$\mathrm{ParamRamp}(z; r, \alpha, \beta) = \begin{cases} \alpha z, & z \leq 0, \\ z, & 0 < z \leq r, \\ r + \beta(z - r), & z > r, \end{cases}$$

where $\alpha, \beta \in [0, 1)$. The presence of three regimes enables more diverse neuron “statuses” under interval certification. Application within IBP or LBP requires segment-wise linear bounding on $(-\infty, 0]$, $[0, r]$, and $[r, \infty)$ (see Appendix A.7 in (Lyu et al., 2021)).
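A minimal NumPy sketch of the piecewise-linear ParamRamp and its interval image under IBP follows; because the function is monotone for $\alpha, \beta \geq 0$, the IBP step only evaluates it at the interval endpoints. Parameter names mirror the definition above; the example values are illustrative assumptions.

```python
import numpy as np

def param_ramp(z, r, alpha, beta):
    """Piecewise-linear ParamRamp: alpha*z for z<=0, z for 0<z<=r, r+beta*(z-r) for z>r."""
    return np.where(z <= 0, alpha * z,
                    np.where(z <= r, z, r + beta * (z - r)))

def param_ramp_interval(z_low, z_up, r, alpha, beta):
    """IBP bounds through ParamRamp: monotone in z, so evaluate at the endpoints."""
    return param_ramp(z_low, r, alpha, beta), param_ramp(z_up, r, alpha, beta)

# Example: a pre-activation interval spanning all three regimes.
lo, hi = param_ramp_interval(np.array([-1.0]), np.array([2.0]), r=1.0, alpha=0.1, beta=0.05)
print(lo, hi)  # [-0.1] [1.05]
```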

Empirical results demonstrate that ParamRamp improves both verified and adversarial robustness: for example, on MNIST ($\epsilon = 0.4$), IBP-verified error drops from approximately 14.8% (ReLU) to approximately 10.9% (ParamRamp), and PGD error from approximately 11.1% to 6.6%. On CIFAR-10 ($\epsilon = 2/255$ and $8/255$), the improvements range up to 7–8 points in verified error. On Tiny-ImageNet ($\epsilon = 1/255$), a WideResNet with ParamRamp achieves an IBP-verified error of 82.94%, reported as the best to date (Lyu et al., 2021).

5. Empirical Analysis and Performance Characteristics

Extensive experiments on MNIST, CIFAR-10, and Tiny-ImageNet show:

  • LBP, and especially CROWN-LBP, provide tighter verification bounds (lower verified error) than plain IBP, both during evaluation and in post-hoc certification, with a negligible increase in computational overhead.
  • Networks trained with ParamRamp activation systematically outperform ReLU networks on verified robustness, with the largest gains on more challenging datasets and perturbation regimes.
  • On Tiny-ImageNet ($\epsilon = 1/255$, WideResNet), the best-reported IBP-verified error is 82.94% (Lyu et al., 2021).
  • The combination of IBP-trained models (with ParamRamp) and LBP verification achieves the lowest known verified errors across multiple datasets and domain perturbations.

The following table summarizes key robust error rates:

| Dataset | Method | Activation | $\epsilon$ | IBP-verified error (%) | PGD error (%) |
|---|---|---|---|---|---|
| MNIST | IBP | ReLU | 0.4 | 14.8 | 11.1 |
| MNIST | IBP | ParamRamp | 0.4 | 10.9 | 6.6 |
| CIFAR-10 | IBP | ReLU | 2/255–8/255 | — | — |
| CIFAR-10 | IBP + LBP/CROWN | ParamRamp | 2/255–8/255 | 7–8 points lower than ReLU | improved |
| Tiny-ImageNet | IBP (WideResNet) | ParamRamp | 1/255 | 82.94 | — |

All methods retain minimal additional training or verification overhead compared to canonical IBP (Lyu et al., 2021).

6. Theoretical and Practical Implications

This line of research establishes:

  • The existence of a strict performance hierarchy among IBP, (CROWN-)LBP, and CROWN, with IBP always looser unless CROWN adopts constant bounding lines.
  • The practical feasibility of scaling tight, linear-relaxation-inspired verification (LBP) to large-scale, high-dimensional problems at nearly the same cost as IBP, enabling routine post-hoc certification and improved certified accuracy.
  • Activation choice, specifically ParamRamp, substantially improves bound tightness and expressivity of certified models, mitigating the collapse into dead neuron regimes endemic to IBP-ReLU training.
  • The empirical finding that verified robust error can be substantially reduced (by 4–8 percentage points) through integrated architectural and bound-propagation advances, without requiring new optimizer infrastructure.

In summary, Interval Bound Propagation is a foundational technique for training certifiably robust neural networks. When combined with advances in linear bound propagation and suitable activations like ParamRamp, it supports both improved robustness certificates and scalability to deep, expressive architectures, while retaining the computational efficiency and analytic tractability essential for practical robust ML deployments (Lyu et al., 2021).
