- The paper introduces a scalable training method based on Interval Bound Propagation that requires only two additional forward passes per step to bound a network's outputs under norm-bounded input perturbations.
- It employs a training curriculum that gradually increases the perturbation radius ϵ and anneals κ, the weight balancing nominal accuracy against robustness, yielding competitive empirical and verified accuracy.
- Empirical evaluations on MNIST, CIFAR-10, SVHN, and downscaled ImageNet demonstrate improved verified error rates and practical scalability for robust image classification.
Scalable Verified Training for Provably Robust Image Classification
The paper by Sven Gowal et al. presents a methodology for training deep neural networks that are provably robust to norm-bounded adversarial perturbations. It addresses the challenge of scaling verified training to large networks while maintaining state-of-the-art verified accuracy, using a technique known as Interval Bound Propagation (IBP).
Summary of Contributions
- Interval Bound Propagation (IBP): The paper trains verifiably robust classifiers with IBP, a fast bounding method derived from interval arithmetic. Unlike more complex relaxation-based techniques, IBP requires only two additional forward passes through the network (propagating elementwise lower and upper bounds), which lets it scale effectively; a minimal sketch appears after this list.
- Training Strategy: The authors employ a training curriculum that gradually increases ϵ, the perturbation radius, and anneals κ, the weight balancing the nominal loss against the worst-case (verified) loss. This schedule keeps the propagated bounds tight enough to learn from, yielding both nominal accuracy and verified robustness; the combined loss is sketched after this list.
- Empirical and Verified Evaluation: The method achieves empirical adversarial accuracy competitive with prevalent approaches such as those of Madry et al. and Wong et al., while delivering superior verified accuracy. This is demonstrated through extensive experiments on MNIST, CIFAR-10, SVHN, and a downscaled version of ImageNet, for which the verified bounds are non-vacuous.
- Theoretical and Practical Implications: The results challenge the assumption that tighter relaxations are always necessary for stronger verifiable guarantees. IBP's simplicity and computational efficiency make it a viable choice for broader applications.
- Code Availability: The paper ensures reproducibility by providing the code for training robust models with IBP.
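To make the bound computation concrete, the following is a minimal NumPy sketch of IBP through a fully connected ReLU network. The function names and the `layers` representation are illustrative assumptions, not taken from the paper's released code:

```python
import numpy as np

def interval_affine(l, u, W, b):
    # Standard interval arithmetic through an affine map W x + b:
    # the centre is propagated through W, the radius through |W|.
    centre = (u + l) / 2.0
    radius = (u - l) / 2.0
    new_centre = W @ centre + b
    new_radius = np.abs(W) @ radius
    return new_centre - new_radius, new_centre + new_radius

def interval_relu(l, u):
    # ReLU is monotonic, so it maps interval endpoints directly.
    return np.maximum(l, 0.0), np.maximum(u, 0.0)

def ibp_bounds(x, eps, layers):
    # Elementwise bounds on the logits of a ReLU network, valid for every
    # input within an L-infinity ball of radius eps around x.
    # `layers` is a list of (W, b) pairs; ReLU sits between consecutive layers.
    l, u = x - eps, x + eps
    for i, (W, b) in enumerate(layers):
        l, u = interval_affine(l, u, W, b)
        if i < len(layers) - 1:  # no activation after the final (logit) layer
            l, u = interval_relu(l, u)
    return l, u
```

The two bound passes (lower and upper) are what keep the method's cost close to standard training.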
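And here is a sketch, under the same illustrative assumptions, of the κ-weighted objective the curriculum anneals: the nominal cross-entropy is mixed with the cross-entropy of the worst-case logits implied by the IBP bounds, while ϵ and κ follow simple linear ramps. The schedule endpoints (e.g., annealing κ from 1 toward 0.5) are hyperparameters described in the paper; this code reuses `ibp_bounds` from the previous sketch:

```python
def worst_case_logits(l, u, y):
    # Most adversarial logit vector consistent with the bounds: the true
    # class y takes its lower bound, every other class its upper bound.
    z_hat = u.copy()
    z_hat[y] = l[y]
    return z_hat

def cross_entropy(logits, y):
    logits = logits - logits.max()  # shift for numerical stability
    return np.log(np.exp(logits).sum()) - logits[y]

def ibp_loss(x, y, layers, eps, kappa):
    # kappa weights the nominal loss against the verified worst-case loss.
    z = x
    for i, (W, b) in enumerate(layers):
        z = W @ z + b
        if i < len(layers) - 1:
            z = np.maximum(z, 0.0)
    l, u = ibp_bounds(x, eps, layers)
    return (kappa * cross_entropy(z, y)
            + (1.0 - kappa) * cross_entropy(worst_case_logits(l, u, y), y))

def linear_ramp(step, end_step, start_val, end_val):
    # Linear schedule used for both eps (0 -> eps_train) and kappa
    # (1 -> its final value); the exact endpoints are hyperparameters.
    t = min(max(step / max(end_step, 1), 0.0), 1.0)
    return start_val + t * (end_val - start_val)
```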
Results and Comparisons
The research establishes benchmarks in verified robustness across several datasets. Throughout, verified error counts a test point as wrong unless the bounds prove that no perturbation within the ϵ-ball can change its prediction (a minimal certification check is sketched after this list):
- MNIST: The verified error rate improves to 2.23% at ϵ=0.1, outperforming other methods in both empirical adversarial robustness and verified bounds.
- CIFAR-10 and SVHN: The paper reports competitive adversarial robustness and sets new state-of-the-art verified accuracies.
- ImageNet: The paper reports training a provably robust model on a downscaled version of ImageNet, achieving a non-vacuous verified error rate of 93.87% at ϵ=1/255, an important milestone for the scalability of verifiable models.
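For reference, the certification check behind these verified error numbers is straightforward given the bounds above: a test point is certified when the true class provably dominates all others (same illustrative NumPy conventions as before):

```python
def is_verified(l, u, y):
    # Certified robust when the true class's lower bound exceeds the upper
    # bound of every other class; verified error is the fraction of test
    # points for which this check fails.
    others = np.delete(u, y)
    return l[y] > others.max()
```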
Implications for Future Work
The paper paves the way for scalable verification tools in practical applications where robustness is critical. Moreover, the approach invites exploration of hybrid methods that combine simple, efficient bounding techniques like IBP with more sophisticated relaxations to handle deeper and wider models.
Conclusion
By leveraging IBP within a well-crafted training paradigm, the researchers demonstrate that large-scale neural networks can be trained efficiently with rigorous robustness guarantees. This advancement offers a practical path toward integrating verifiable robustness into machine learning systems, and is likely to influence future work in adversarial machine learning and verification research.