On the Robustness of Bayesian Neural Networks to Adversarial Attacks (2207.06154v3)

Published 13 Jul 2022 in cs.LG, cs.AI, and cs.CR

Abstract: Vulnerability to adversarial attacks is one of the principal hurdles to the adoption of deep learning in safety-critical applications. Despite significant efforts, both practical and theoretical, training deep learning models robust to adversarial attacks is still an open problem. In this paper, we analyse the geometry of adversarial attacks in the large-data, overparameterized limit for Bayesian Neural Networks (BNNs). We show that, in the limit, vulnerability to gradient-based attacks arises as a result of degeneracy in the data distribution, i.e., when the data lies on a lower-dimensional submanifold of the ambient space. As a direct consequence, we demonstrate that in this limit BNN posteriors are robust to gradient-based adversarial attacks. Crucially, we prove that the expected gradient of the loss with respect to the BNN posterior distribution vanishes, even when each neural network sampled from the posterior is vulnerable to gradient-based attacks. Experimental results on the MNIST, Fashion MNIST, and half moons datasets, representing the finite-data regime, with BNNs trained with Hamiltonian Monte Carlo and Variational Inference, support this line of argument, showing that BNNs can display both high accuracy on clean data and robustness to both gradient-based and gradient-free adversarial attacks.
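
The quantity at the centre of the abstract is the gradient of the loss with respect to the input, averaged over the BNN posterior. The sketch below is a minimal illustration of how that posterior-averaged gradient can be estimated by Monte Carlo over weight samples (e.g., from HMC or VI); it is not the authors' code, and the architecture, loss, and names are assumptions chosen for clarity.

```python
import torch
import torch.nn as nn

# Minimal sketch (not the paper's implementation) of the quantity
#   E_{w ~ p(w|D)} [ grad_x L(f_w(x), y) ],
# which the paper proves vanishes in the large-data, overparameterized limit.
# Architecture, loss, and variable names below are illustrative assumptions.

def make_net():
    # Small classifier for 2-D inputs (e.g., the half moons dataset).
    return nn.Sequential(nn.Linear(2, 64), nn.ReLU(), nn.Linear(64, 2))

def expected_input_gradient(posterior_samples, x, y):
    """Monte-Carlo estimate of the posterior-averaged input gradient of the loss.

    posterior_samples: list of state_dicts drawn from the (approximate) posterior,
                       e.g. by Hamiltonian Monte Carlo or Variational Inference.
    x, y: input batch and integer class labels.
    """
    loss_fn = nn.CrossEntropyLoss()
    grads = []
    for state in posterior_samples:
        net = make_net()
        net.load_state_dict(state)
        x_in = x.clone().detach().requires_grad_(True)
        loss_fn(net(x_in), y).backward()
        grads.append(x_in.grad.detach())
    return torch.stack(grads).mean(dim=0)

# An FGSM-style attack on the posterior predictive would perturb x by
# epsilon * sign(expected_input_gradient(...)); the paper's result says this
# averaged gradient becomes uninformative in the limit, even though the
# gradient for any single sampled network need not vanish.
```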

