Beyond Full Poisoning: Effective Availability Attacks with Partial Perturbation (2407.02437v2)

Published 2 Jul 2024 in cs.LG, cs.CR, and cs.CV

Abstract: The widespread use of publicly available datasets for training machine learning models raises significant concerns about data misuse. Availability attacks have emerged as a means for data owners to safeguard their data by designing imperceptible perturbations that degrade model performance when incorporated into training datasets. However, existing availability attacks are ineffective when only a portion of the data can be perturbed. To address this challenge, we propose a novel availability attack approach termed Parameter Matching Attack (PMA). PMA is the first availability attack capable of causing more than a 30% performance drop when only a portion of data can be perturbed. PMA optimizes perturbations so that when the model is trained on a mixture of clean and perturbed data, the resulting model will approach a model designed to perform poorly. Experimental results across four datasets demonstrate that PMA outperforms existing methods, achieving significant model performance degradation when a part of the training data is perturbed. Our code is available in the supplementary materials.
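The abstract only sketches the parameter-matching idea: perturb a subset of the training data so that a model trained on the clean-plus-perturbed mixture is pulled toward a reference model that is deliberately bad. Below is a minimal, hypothetical PyTorch sketch of that objective, not the authors' implementation; the surrogate model, the single unrolled SGD step, the L-infinity budget `epsilon`, and every function and argument name here are illustrative assumptions.

```python
# Hypothetical sketch of a parameter-matching perturbation objective.
# Assumptions (not from the paper): a fixed surrogate model, one unrolled
# SGD step per outer iteration, an L_inf perturbation budget, Adam on delta.
import torch
import torch.nn.functional as F

def optimize_perturbations(model, bad_params, x_pert, y_pert, x_clean, y_clean,
                           epsilon=8 / 255, lr_inner=0.1, lr_delta=0.01, steps=50):
    """Optimize delta on x_pert so that an SGD step on the clean+perturbed
    mixture moves the surrogate's parameters toward bad_params (tensors with
    the same shapes as model.parameters(), taken from a poorly performing
    reference model)."""
    delta = torch.zeros_like(x_pert, requires_grad=True)
    opt = torch.optim.Adam([delta], lr=lr_delta)
    params = list(model.parameters())

    for _ in range(steps):
        # Inner step: one differentiable SGD update on the mixed batch.
        x_mix = torch.cat([x_clean, (x_pert + delta).clamp(0, 1)])
        y_mix = torch.cat([y_clean, y_pert])
        loss_inner = F.cross_entropy(model(x_mix), y_mix)
        grads = torch.autograd.grad(loss_inner, params, create_graph=True)
        new_params = [p - lr_inner * g for p, g in zip(params, grads)]

        # Outer objective: squared distance between the updated parameters
        # and the "bad" reference parameters.
        loss_outer = sum(((np_ - bp) ** 2).sum()
                         for np_, bp in zip(new_params, bad_params))
        opt.zero_grad()
        loss_outer.backward()
        opt.step()

        # Keep the perturbation imperceptible (project onto the L_inf ball).
        with torch.no_grad():
            delta.clamp_(-epsilon, epsilon)

    return delta.detach()
```

In practice the paper's bilevel formulation would be more involved (e.g., multiple unrolled training steps and repeated surrogate reinitialization); this sketch only illustrates how a gradient with respect to the perturbations can flow through a training update and into a parameter-distance loss.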
