Diffusion Denoising as a Certified Defense against Clean-label Poisoning (2403.11981v1)

Published 18 Mar 2024 in cs.CR, cs.CV, and cs.LG

Abstract: We present a certified defense against clean-label poisoning attacks. These attacks inject a small number of poisoned samples (e.g., 1% of the training data) containing $p$-norm-bounded adversarial perturbations into the training set to induce a targeted misclassification of a test-time input. Inspired by the adversarial robustness achieved by denoised smoothing, we show how an off-the-shelf diffusion model can sanitize the tampered training data. We extensively test our defense against seven clean-label poisoning attacks and reduce their attack success to 0-16% with only a negligible drop in test-time accuracy. Compared with existing countermeasures against clean-label poisoning, our defense reduces attack success the most and offers the best model utility. Our results highlight the need for future work on developing stronger clean-label attacks, using our certified yet practical defense as a strong baseline to evaluate them.
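The defense amounts to a single preprocessing pass: before training, every (potentially poisoned) image is perturbed with Gaussian noise and then denoised in one shot by a pretrained diffusion model, following the denoised-smoothing recipe the abstract cites. Below is a minimal PyTorch sketch of that sanitization step; the `denoiser(noisy, t)` interface, the `sigma` value, and all names are illustrative assumptions, not the authors' released code.

```python
# Minimal sketch of diffusion-based training-data sanitization,
# assuming a pretrained DDPM denoiser with the (hypothetical)
# interface denoiser(x_t, t) -> x0_hat.
import torch

def sanitize(images, denoiser, alphas_cumprod, sigma=0.25):
    """Add Gaussian noise at level `sigma`, then denoise in one shot.

    images:         (N, C, H, W) tensor scaled to [-1, 1]
    denoiser:       pretrained diffusion model predicting the clean image
    alphas_cumprod: (T,) cumulative noise schedule of the DDPM
    sigma:          smoothing noise level (illustrative default)
    """
    # Map sigma to a diffusion timestep t*. The DDPM forward process is
    # x_t = sqrt(a_t) * x0 + sqrt(1 - a_t) * eps, so the equivalent
    # randomized-smoothing noise level is sqrt((1 - a_t) / a_t).
    sigmas = ((1 - alphas_cumprod) / alphas_cumprod).sqrt()
    t_star = int(torch.argmin((sigmas - sigma).abs()))
    a_t = alphas_cumprod[t_star]

    # Noise every training image, drowning out any bounded
    # clean-label perturbation...
    noisy = a_t.sqrt() * images + (1 - a_t).sqrt() * torch.randn_like(images)
    # ...then recover a clean estimate with a single denoising step.
    return denoiser(noisy, t_star)
```

The classifier is then trained on the sanitized images as usual; the certified guarantee comes from the randomized-smoothing analysis that denoised smoothing builds on, since the bounded poisoning perturbation is absorbed by the added Gaussian noise.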

Authors (3)
  1. Sanghyun Hong (38 papers)
  2. Nicholas Carlini (101 papers)
  3. Alexey Kurakin (19 papers)
Citations (3)
