
Interval Bound Propagation (IBP)

Updated 12 November 2025
  • Interval Bound Propagation (IBP) is a method that propagates input uncertainty through neural network layers using interval arithmetic to yield certified output bounds.
  • It integrates directly into the training loss to enforce robust margins and achieve provable robustness against norm-bounded perturbations.
  • Enhancements like CROWN, LBP, and ParamRamp activation improve bound tightness and verified accuracy on datasets such as MNIST, CIFAR-10, and Tiny-ImageNet.

Interval Bound Propagation (IBP) is a scalable and analytically grounded technique for training and verifying neural networks with provable robustness guarantees against norm-bounded or structured input perturbations. It operates by forward-propagating axis-aligned interval bounds, constructed over an input uncertainty set (typically an $\ell_p$-ball), through each layer of the network using interval arithmetic. This yields certified elementwise lower and upper bounds on the outputs for all admissible input perturbations. IBP is computationally lightweight and differentiable, enabling its use directly in the loss function during robust training. Despite its practical successes—such as state-of-the-art verified robust accuracy on MNIST, CIFAR-10, and Tiny-ImageNet—IBP's looseness (especially the “wrapping effect” caused by compounding over-approximations) and associated limitations have motivated theoretical investigations, practical algorithmic improvements, and innovations in model architecture and training protocols.

1. Formal Definition and Layerwise Propagation

Consider a feed-forward network of $m$ layers:

$$z^{(k)} = W^{(k)} a^{(k-1)} + b^{(k)}, \quad a^{(k)} = \sigma(z^{(k)}), \quad k = 1, \dots, m,$$

with $a^{(0)} = x$ (the input) and $\sigma$ an elementwise monotonic activation (e.g., ReLU). Let the input belong to a convex uncertainty set, usually an $\ell_p$-ball,

$$\mathbb{B}_p(x_0, \epsilon) = \{x : \|x - x_0\|_p \leq \epsilon\}.$$

IBP maintains for each layer $k$ a lower bound $\underline{z}^{(k)}$ and an upper bound $\overline{z}^{(k)}$ such that for every $x \in \mathbb{B}_p(x_0, \epsilon)$ and every neuron $i$:

$$\underline{z}_i^{(k)} \leq z_i^{(k)}(x) \leq \overline{z}_i^{(k)}.$$

The bounds are computed recursively as follows:

  • Input Layer (Base Case): By Hölder’s inequality,

$$\underline{z}^{(1)} = W^{(1)} x_0 + b^{(1)} - \epsilon \|W^{(1)}\|_q, \quad \overline{z}^{(1)} = W^{(1)} x_0 + b^{(1)} + \epsilon \|W^{(1)}\|_q, \quad \text{where } 1/p + 1/q = 1,$$

with the $q$-norm taken row-wise (one Hölder bound per output neuron).

  • Affine Layers: For $k > 1$,

$$\begin{aligned} \underline{z}^{(k)} &= \mathrm{ReLU}(W^{(k)})\,\underline{a}^{(k-1)} + \mathrm{Neg}(W^{(k)})\,\overline{a}^{(k-1)} + b^{(k)}, \\ \overline{z}^{(k)} &= \mathrm{ReLU}(W^{(k)})\,\overline{a}^{(k-1)} + \mathrm{Neg}(W^{(k)})\,\underline{a}^{(k-1)} + b^{(k)}, \end{aligned}$$

with $\mathrm{ReLU}(W)_{ij} = \max\{W_{ij}, 0\}$ and $\mathrm{Neg}(W)_{ij} = \min\{W_{ij}, 0\}$.

  • Activation Layers: For any monotonic activation $\sigma$,

$$\underline{a}^{(k)} = \sigma(\underline{z}^{(k)}), \quad \overline{a}^{(k)} = \sigma(\overline{z}^{(k)}).$$

This process iterates forward, yielding at the output layer an enclosure for each logit.
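To make the recursion concrete, the following minimal NumPy sketch propagates interval bounds through a fully connected ReLU network for the $\ell_\infty$ case ($p = \infty$), where the input box is exact. The network, shapes, and random parameters are purely illustrative assumptions, not taken from any cited work.

```python
import numpy as np

def ibp_forward(weights, biases, x0, eps):
    """Propagate elementwise interval bounds through an affine + ReLU network.

    weights[k] has shape [out_k, in_k], biases[k] has shape [out_k].
    x0 is the nominal input; eps is the l_inf radius of the input box.
    Returns (lower, upper) bounds on the output logits over the whole box.
    """
    lower, upper = x0 - eps, x0 + eps  # base case: an l_inf ball is an axis-aligned box
    for k, (W, b) in enumerate(zip(weights, biases)):
        W_pos, W_neg = np.maximum(W, 0.0), np.minimum(W, 0.0)
        # Affine layer: positive weights pair with the matching bound,
        # negative weights with the opposite bound.
        z_low = W_pos @ lower + W_neg @ upper + b
        z_up  = W_pos @ upper + W_neg @ lower + b
        if k < len(weights) - 1:
            # Monotonic activation (ReLU): apply elementwise to both endpoints.
            lower, upper = np.maximum(z_low, 0.0), np.maximum(z_up, 0.0)
        else:
            lower, upper = z_low, z_up
    return lower, upper

# Illustrative usage with a tiny random two-layer network.
rng = np.random.default_rng(0)
Ws = [rng.standard_normal((8, 4)), rng.standard_normal((3, 8))]
bs = [np.zeros(8), np.zeros(3)]
lo, hi = ibp_forward(Ws, bs, rng.standard_normal(4), eps=0.1)
assert np.all(lo <= hi)  # the enclosure is elementwise consistent
```

For general $p$, only the base case changes: the input box is replaced by the per-neuron Hölder bounds for the first affine layer given above.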

2. Certified Robust Training and Optimization Objective

To train provably robust models, IBP is applied within the optimization objective to lower-bound the classification margin

$$\omega_i(x) = z^{(m)}_y(x) - z^{(m)}_i(x), \quad i \neq y,$$

uniformly over all $x$ in the input uncertainty set. The tightest certified lower margin is

$$\underline{\omega}^{\mathrm{IBP}}(x_0, \epsilon) = \min_{i \neq y} \left[ \underline{z}^{(m)}_y - \overline{z}^{(m)}_i \right].$$

The robust training loss is constructed as a mixture of standard cross-entropy on clean predictions and a penalty on the worst-case margin:

$$\mathbb{E}_{(x_0, y) \sim \mathcal{D}} \left[ \kappa\, L(z^{(m)}(x_0), y) + (1 - \kappa)\, L(-\underline{\omega}^{\mathrm{IBP}}(x_0, \epsilon), y) \right],$$

where $\kappa \in [0, 1]$ controls the trade-off. Since $\underline{\omega}^{\mathrm{IBP}}$ is differentiable with respect to the network parameters, the entire objective supports standard backpropagation and stochastic gradient descent (Lyu et al., 2021, Gowal et al., 2018).

Schedule annealing for $\epsilon$ (ramp-up) and $\kappa$ (ramp-down) is essential: gradually increasing $\epsilon$ early in training and slowly mixing in the robust loss via $\kappa$ are crucial for convergence and for tight final bounds (Gowal et al., 2018, Morawiecki et al., 2019).
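As a hedged illustration of how this objective is commonly implemented, the PyTorch-style sketch below mixes the clean cross-entropy with cross-entropy on worst-case logits built from the IBP output bounds (the worst-case logit construction follows Gowal et al. (2018) and corresponds to the margin bound above). The two-layer network, `eps`, and `kappa` values are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def ibp_bounds(linears, lower, upper):
    """Interval propagation through alternating Linear/ReLU layers (batched)."""
    for k, layer in enumerate(linears):
        W_pos, W_neg = layer.weight.clamp(min=0), layer.weight.clamp(max=0)
        z_low = lower @ W_pos.t() + upper @ W_neg.t() + layer.bias
        z_up  = upper @ W_pos.t() + lower @ W_neg.t() + layer.bias
        # ReLU on hidden layers only; the final layer outputs logits.
        lower, upper = (z_low.relu(), z_up.relu()) if k < len(linears) - 1 else (z_low, z_up)
    return lower, upper

def ibp_robust_loss(linears, x0, y, eps, kappa):
    """kappa * clean cross-entropy + (1 - kappa) * cross-entropy on worst-case logits."""
    logits = x0
    for k, layer in enumerate(linears):
        logits = layer(logits)
        if k < len(linears) - 1:
            logits = logits.relu()
    lower, upper = ibp_bounds(linears, x0 - eps, x0 + eps)
    # Worst case for the margin: true-class logit at its lower bound,
    # every other logit at its upper bound.
    true_class = F.one_hot(y, logits.shape[-1]).bool()
    worst = torch.where(true_class, lower, upper)
    return kappa * F.cross_entropy(logits, y) + (1 - kappa) * F.cross_entropy(worst, y)

# Illustrative usage; in practice eps is ramped up and kappa ramped down over training.
net = [nn.Linear(4, 8), nn.Linear(8, 3)]
x, y = torch.randn(16, 4), torch.randint(0, 3, (16,))
loss = ibp_robust_loss(net, x, y, eps=0.01, kappa=0.9)
loss.backward()  # the bound is differentiable, so standard SGD applies
```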

3. Comparison with CROWN and Linear Bound Propagation (LBP)

CROWN is a bounding method that uses tight linear (affine) relaxations for nonlinear activations—specifically, it replaces $\sigma(z)$ by parallel upper and lower linear functions over each pre-activation interval $[l, u]$:

$$h^L(z) \leq \sigma(z) \leq h^U(z).$$

CROWN then backpropagates the effect of these relaxations via dual variables to lower layers, yielding linear enclosures for the network outputs.

Selecting the constant bounding lines $h^L(z) = \sigma(l)$, $h^U(z) = \sigma(u)$ in CROWN causes it to collapse to the IBP bound. Using the tightest zero-intercept lines through $(l, \sigma(l))$ and $(u, \sigma(u))$ produces strictly tighter relaxations at unstable ReLU neurons (those whose pre-activation interval spans zero). The formal result (Theorem 3.3) is

$$[\underline{z}^{(m)}_{\mathrm{IBP}}, \overline{z}^{(m)}_{\mathrm{IBP}}] \supseteq [\underline{z}^{(m)}_{\mathrm{CROWN}}, \overline{z}^{(m)}_{\mathrm{CROWN}}],$$

i.e., CROWN's bounds are always at least as tight as IBP's when tight bounding lines are used (Lyu et al., 2021).
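For concreteness, with the conventions above (parallel bounding lines, zero-intercept lower line), the relaxation of an unstable ReLU neuron with pre-activation bounds $l < 0 < u$ can be written as below. Treating the shared slope as $u/(u-l)$ is a standard tight choice; CROWN more generally also admits other lower-line slopes in $[0, 1]$. This is a sketch of the usual construction, not a quotation from the cited paper.

```latex
% Tight parallel relaxation of ReLU(z) = max(z, 0) on [l, u] with l < 0 < u:
% the upper line passes through (l, 0) and (u, u); the lower line shares its
% slope and passes through the origin.
\begin{aligned}
  h^U(z) &= \frac{u}{u - l}\,(z - l), \\
  h^L(z) &= \frac{u}{u - l}\, z,
\end{aligned}
\qquad h^L(z) \le \mathrm{ReLU}(z) \le h^U(z) \quad \text{for all } z \in [l, u].
```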

LBP interpolates between full CROWN and IBP by truncating back-propagation of linear relaxations to a smaller number of layers—a pure LBP step back-propagates across only one nonlinear layer per bound, retaining $O(m)$ memory and computational cost. LBP's bounds are at least as tight as IBP's (when using tight bounding lines), with strictly lower verified error in empirical studies (Lyu et al., 2021).

4. Activation Function Impact: The ParamRamp Mechanism

IBP-trained ReLU networks often exhibit a preponderance of “dead” neurons—neurons whose pre-activations become entirely inactive ($\leq 0$) or robustly active ($\geq r$)—reducing representational capacity and certified performance. ParamRamp generalizes LeakyReLU with a learnable ramp point $r > 0$:

$$\mathrm{ParamRamp}(z; r, \alpha, \beta) = \begin{cases} \alpha z, & z \leq 0, \\ z, & 0 < z \leq r, \\ r + \beta(z - r), & z > r, \end{cases}$$

where $\alpha, \beta \in [0, 1)$. The presence of three regimes enables more diverse neuron “statuses” under interval certification. Application within IBP or LBP requires segment-wise linear bounding on $(-\infty, 0]$, $[0, r]$, and $[r, \infty)$ (see Appendix A.7 in (Lyu et al., 2021)).
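A minimal NumPy sketch of the piecewise-linear ParamRamp and its interval image under IBP follows; because the function is monotone for $\alpha, \beta \geq 0$, the IBP step only evaluates it at the interval endpoints. Parameter names mirror the definition above; the example values are illustrative assumptions.

```python
import numpy as np

def param_ramp(z, r, alpha, beta):
    """Piecewise-linear ParamRamp: alpha*z for z<=0, z for 0<z<=r, r+beta*(z-r) for z>r."""
    return np.where(z <= 0, alpha * z,
                    np.where(z <= r, z, r + beta * (z - r)))

def param_ramp_interval(z_low, z_up, r, alpha, beta):
    """IBP bounds through ParamRamp: monotone in z, so evaluate at the endpoints."""
    return param_ramp(z_low, r, alpha, beta), param_ramp(z_up, r, alpha, beta)

# Example: a pre-activation interval spanning all three regimes.
lo, hi = param_ramp_interval(np.array([-1.0]), np.array([2.0]), r=1.0, alpha=0.1, beta=0.05)
print(lo, hi)  # [-0.1] [1.05]
```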

Empirical results demonstrate that ParamRamp improves both verified and adversarial robustness: for example, on MNIST ($\epsilon = 0.4$), IBP-verified error drops from approximately 14.8% (ReLU) to approximately 10.9% (ParamRamp), and PGD error from approximately 11.1% to 6.6%. On CIFAR-10 ($\epsilon = 2/255$ and $8/255$), the improvements range up to 7–8 points in verified error. On Tiny-ImageNet ($\epsilon = 1/255$), a WideResNet with ParamRamp achieves an IBP-verified error of 82.94%, reported as the best to date (Lyu et al., 2021).

5. Empirical Analysis and Performance Characteristics

Extensive experiments on MNIST, CIFAR-10, and Tiny-ImageNet show:

  • LBP, and especially CROWN-LBP, provide tighter verification bounds (lower verified error) than plain IBP, both during evaluation and in post-hoc certification, with a negligible increase in computational overhead.
  • Networks trained with ParamRamp activation systematically outperform ReLU networks on verified robustness, with the largest gains on more challenging datasets and perturbation regimes.
  • On Tiny-ImageNet ($\epsilon = 1/255$, WideResNet), the best-reported IBP-verified error is 82.94% (Lyu et al., 2021).
  • The combination of IBP-trained models (with ParamRamp) and LBP verification achieves the lowest known verified errors across multiple datasets and domain perturbations.

The following table summarizes key robust error rates:

| Dataset | Method | Activation | $\epsilon$ | IBP-verified error (%) | PGD error (%) |
|---|---|---|---|---|---|
| MNIST | IBP | ReLU | 0.4 | 14.8 | 11.1 |
| MNIST | IBP | ParamRamp | 0.4 | 10.9 | 6.6 |
| CIFAR-10 | IBP | ReLU | 2/255–8/255 | — | — |
| CIFAR-10 | IBP + LBP/CROWN | ParamRamp | 2/255–8/255 | 7–8 points lower than ReLU | improved |
| Tiny-ImageNet | IBP (WideResNet) | ParamRamp | 1/255 | 82.94 | — |

All methods retain minimal additional training or verification overhead compared to canonical IBP (Lyu et al., 2021).

6. Theoretical and Practical Implications

This line of research establishes:

  • The existence of a strict performance hierarchy among IBP, (CROWN-)LBP, and CROWN, with IBP always looser unless CROWN adopts constant bounding lines.
  • The practical feasibility of scaling tight, linear-relaxation-inspired verification (LBP) to large-scale, high-dimensional problems at nearly the same cost as IBP, enabling routine post-hoc certification and improved certified accuracy.
  • Activation choice, specifically ParamRamp, substantially improves bound tightness and expressivity of certified models, mitigating the collapse into dead neuron regimes endemic to IBP-ReLU training.
  • The empirical finding that verified robust error can be substantially reduced (by 4–8 percentage points) through integrated architectural and bound-propagation advances, without requiring new optimizer infrastructure.

In summary, Interval Bound Propagation is a foundational technique for training certifiably robust neural networks. When combined with advances in linear bound propagation and suitable activations like ParamRamp, it supports both improved robustness certificates and scalability to deep, expressive architectures, while retaining the computational efficiency and analytic tractability essential for practical robust ML deployments (Lyu et al., 2021).
