Counter-Samples: A Stateless Strategy to Neutralize Black Box Adversarial Attacks (2403.10562v1)

Published 14 Mar 2024 in cs.CR, cs.AI, and cs.LG

Abstract: Our paper presents a novel defence against black box attacks, in which attackers use the victim model as an oracle to craft their adversarial examples. Unlike traditional preprocessing defences that rely on sanitizing input samples, our stateless strategy counters the attack process itself. For every query we evaluate a counter-sample instead, where the counter-sample is the original sample optimized against the attacker's objective. By countering every black box query with a targeted white box optimization, our strategy introduces an asymmetry into the game, to the defender's advantage. This defence not only misleads the attacker's search for an adversarial example, but also preserves the model's accuracy on legitimate inputs and is generic to multiple types of attacks. We demonstrate that our approach is remarkably effective against state-of-the-art black box attacks and outperforms existing defences on both the CIFAR-10 and ImageNet datasets. We also show that the proposed defence is robust against strong adversaries.
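To make the stateless strategy concrete, below is a minimal sketch of how a counter-sample response could be computed for each incoming query, assuming a PyTorch classifier with inputs in [0, 1]. The step count, step size, and the use of cross-entropy toward the model's own predicted label are illustrative assumptions for exposition, not the paper's exact objective or hyperparameters.

```python
import torch
import torch.nn.functional as F

def answer_with_counter_sample(model, x, steps=5, alpha=0.01):
    """Respond to a black box query with the model's output on a
    counter-sample: the query optimized against the attacker's
    objective, i.e. pushed toward higher confidence in the model's
    own prediction. Stateless: each query is handled independently.
    (steps, alpha, and the loss choice are illustrative assumptions.)"""
    with torch.no_grad():
        y_pred = model(x).argmax(dim=1)  # label the model assigns to the raw query

    x_cs = x.clone().detach().requires_grad_(True)
    for _ in range(steps):
        loss = F.cross_entropy(model(x_cs), y_pred)
        (grad,) = torch.autograd.grad(loss, x_cs)
        # Gradient descent on the loss: a targeted white box step that
        # increases confidence in y_pred, the opposite direction of a
        # typical attack step, thereby misleading the attacker's search.
        x_cs = (x_cs - alpha * grad.sign()).clamp(0.0, 1.0)
        x_cs = x_cs.detach().requires_grad_(True)

    with torch.no_grad():
        return model(x_cs)  # the attacker only ever observes this response
```

Because each query is optimized independently, no attack-detection state is kept, and a benign input the model already classifies confidently is barely moved, which is consistent with the abstract's claim that accuracy on legitimate inputs is preserved.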

