Adversarial Robustness Certification for Bayesian Neural Networks (2306.13614v1)
Abstract: We study the problem of certifying the robustness of Bayesian neural networks (BNNs) to adversarial input perturbations. Given a compact set of input points $T \subseteq \mathbb{R}^m$ and a set of output points $S \subseteq \mathbb{R}^n$, we define two notions of robustness for BNNs in an adversarial setting: probabilistic robustness and decision robustness. Probabilistic robustness is the probability that for all points in $T$ the output of a BNN sampled from the posterior is in $S$. Decision robustness, in contrast, considers the optimal decision of a BNN and checks whether, for all points in $T$, the optimal decision of the BNN for a given loss function lies within the output set $S$. Although exact computation of these robustness properties is challenging due to the probabilistic and non-convex nature of BNNs, we present a unified computational framework for efficiently and formally bounding them. Our approach is based on weight interval sampling, integration, and bound propagation techniques; it scales to BNNs with a large number of parameters and is independent of the (approximate) inference method employed to train the BNN. We evaluate the effectiveness of our methods on various regression and classification tasks, including an industrial regression benchmark, MNIST, traffic sign recognition, and airborne collision avoidance, and demonstrate that our approach enables certification of robustness and uncertainty of BNN predictions.
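To make the core idea concrete, the sketch below illustrates a simplified Monte Carlo variant of the approach described in the abstract (not the authors' exact weight-interval integration procedure): sample weights from the (approximate) posterior, propagate the input set $T$ through the network with interval bound propagation, and count the fraction of samples whose output box is certified to lie in $S$. The helper names (`sample_posterior`, `output_in_S`) are hypothetical, user-supplied callables assumed for illustration.

```python
import numpy as np

def interval_affine(x_lo, x_hi, W, b):
    """Propagate an axis-aligned box through the affine map W x + b."""
    W_pos, W_neg = np.maximum(W, 0.0), np.minimum(W, 0.0)
    lo = W_pos @ x_lo + W_neg @ x_hi + b
    hi = W_pos @ x_hi + W_neg @ x_lo + b
    return lo, hi

def ibp_forward(x_lo, x_hi, layers):
    """Interval bound propagation through a ReLU network; `layers` is a list of (W, b)."""
    lo, hi = x_lo, x_hi
    for i, (W, b) in enumerate(layers):
        lo, hi = interval_affine(lo, hi, W, b)
        if i < len(layers) - 1:  # ReLU on hidden layers only
            lo, hi = np.maximum(lo, 0.0), np.maximum(hi, 0.0)
    return lo, hi

def probabilistic_robustness_estimate(sample_posterior, x_lo, x_hi,
                                      output_in_S, n_samples=1000):
    """Monte Carlo estimate of P_w( forall x in T : f^w(x) in S ).

    Assumptions of this sketch:
      - sample_posterior() returns one weight sample as a list of (W, b);
      - output_in_S(lo, hi) returns True iff the whole output box lies in S.
    Each sample for which the IBP output box is inside S is certified robust
    for that weight realization; the hit rate estimates probabilistic robustness.
    """
    hits = 0
    for _ in range(n_samples):
        layers = sample_posterior()
        lo, hi = ibp_forward(x_lo, x_hi, layers)
        if output_in_S(lo, hi):
            hits += 1
    return hits / n_samples
```

As a usage example, for an $\ell_\infty$ ball of radius $\epsilon$ around an input $x$, one would set `x_lo = x - eps` and `x_hi = x + eps`, and for classification `output_in_S` could check that the lower bound of the true-class logit exceeds the upper bounds of all other logits. The paper's framework additionally turns such samples into formal bounds by integrating posterior mass over weight intervals, which this sketch omits.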