Training for Faster Adversarial Robustness Verification via Inducing ReLU Stability
The paper "Training for Faster Adversarial Robustness Verification via Inducing ReLU Stability" addresses a critical aspect of neural network deployment in safety-critical environments: adversarial robustness verification. The authors propose a co-design approach that integrates neural network training with the goal of expediting the verification process, significantly reducing the computational complexity by enhancing two key network properties—weight sparsity and ReLU stability.
Main Contributions
The crux of the paper lies in shifting from treating training and verification as separate processes toward a unified approach that treats them as intertwined. This is particularly relevant for models operating in adversarial settings, where a model must be both robust to perturbed inputs and quickly verifiable as such. The contributions are twofold:
- Weight Sparsity: By employing ℓ1-regularization and pruning of small weights, networks are trained to be weight-sparse. Such sparsity reduces the number of variables in the verification formulation, which LP/MILP solvers can exploit directly. Experiments show that these techniques alone turn previously intractable verification instances into tractable ones (see the training sketch after this list).
- ReLU Stability: The paper introduces a novel regularization term, the RS Loss, which promotes ReLU stability: a ReLU is stable when its pre-activation keeps a consistent sign (always active or always inactive) across all allowed perturbations of an input. Stable ReLUs reduce the need for "branching" during verification, where each unstable ReLU forces the solver to consider its active and inactive cases separately and the number of cases can grow exponentially. The authors demonstrate verification time reductions of up to 13x for MNIST models (a sketch of an RS-style regularizer follows this list).
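The following is a minimal sketch of how these two training-time regularizers could look in practice, assuming a PyTorch setup. The architecture, pruning threshold, and loss weights are illustrative choices, and naive interval arithmetic is used here to estimate the pre-activation bounds that the RS Loss needs; the paper's own bound estimation and hyperparameters may differ in their details.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SmallMLP(nn.Module):
    """Small fully connected network, used only to illustrate the regularizers."""
    def __init__(self, in_dim=784, hidden=256, num_classes=10):
        super().__init__()
        self.fc1 = nn.Linear(in_dim, hidden)
        self.fc2 = nn.Linear(hidden, hidden)
        self.out = nn.Linear(hidden, num_classes)

    def forward(self, x):
        return self.out(F.relu(self.fc2(F.relu(self.fc1(x)))))

def l1_penalty(model):
    """Sum of absolute weight values; encourages weight sparsity."""
    return sum(p.abs().sum() for name, p in model.named_parameters() if "weight" in name)

@torch.no_grad()
def prune_small_weights(model, threshold=1e-3):
    """Zero out weights whose magnitude is below a small threshold (applied periodically)."""
    for name, p in model.named_parameters():
        if "weight" in name:
            p.mul_((p.abs() >= threshold).float())

def linear_interval(layer, lb, ub):
    """Propagate elementwise box bounds [lb, ub] through a linear layer via interval arithmetic."""
    center, radius = (ub + lb) / 2, (ub - lb) / 2
    new_center = center @ layer.weight.t() + layer.bias
    new_radius = radius @ layer.weight.abs().t()
    return new_center - new_radius, new_center + new_radius

def rs_loss(model, x, eps):
    """RS-style regularizer: penalize ReLUs whose pre-activation bounds straddle zero
    under l_inf perturbations of size eps, i.e. ReLUs that are not provably stable."""
    lb, ub = x - eps, x + eps
    loss = x.new_zeros(())
    for layer in (model.fc1, model.fc2):
        l, u = linear_interval(layer, lb, ub)
        # -tanh(1 + l*u) is near -1 when l and u share a sign (stable ReLU) and grows
        # when l < 0 < u (unstable ReLU), so minimizing it pushes ReLUs toward stability.
        loss = loss + (-torch.tanh(1.0 + l * u)).sum()
        lb, ub = F.relu(l), F.relu(u)  # bounds on the next layer's input
    return loss

# Illustrative combined objective for one batch (x, y):
#   logits = model(x)
#   total = F.cross_entropy(logits, y) + 1e-4 * l1_penalty(model) + 1e-3 * rs_loss(model, x, eps=0.1)
```

In a setup like this, `prune_small_weights` would be applied occasionally during or after training, and the resulting sparse, ReLU-stable network is what gets handed to the LP/MILP verifier.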
Numerical Results
The methodology shows strong numerical results on the MNIST and CIFAR-10 datasets. For MNIST with ϵ=0.1, provable adversarial accuracy reaches 94.33%, with average verification times of 0.49 seconds, significantly outperforming previous methods. CIFAR-10 results are more modest, with provable adversarial accuracies of up to 20.27% at ϵ=8/255, highlighting the remaining challenge of scaling to more complex datasets and architectures.
Implications and Future Work
This work stands out by pairing robustness verification with broadly applicable training techniques: improvements in weight sparsity and ReLU stability are compatible with existing training procedures and yield efficiency gains irrespective of the verifier used. Integrating these methods can shape how models are prepared for deployment in adversarial contexts, providing not just empirical robustness but formal assurance to stakeholders.
Future work could extend these techniques beyond adversarial robustness to other properties that require formal verification. Exploring additional network properties that influence verification speed, along with potential synergies with model compression, could yield richer and more scalable verification solutions.
In conclusion, by aligning training methodology with verification requirements, the authors provide a practical framework that makes adversarial robustness verification markedly more efficient, a notable step toward reliable DNN deployment in real-world applications.