Towards Scalable and Robust Model Versioning (2401.09574v2)

Published 17 Jan 2024 in cs.LG and cs.CR

Abstract: As the deployment of deep learning models continues to expand across industries, the threat of malicious incursions aimed at gaining access to these deployed models is on the rise. Should an attacker gain access to a deployed model, whether through server breaches, insider attacks, or model inversion techniques, they can then construct white-box adversarial attacks to manipulate the model's classification outcomes, thereby posing significant risks to organizations that rely on these models for critical tasks. Model owners need mechanisms to protect themselves against such losses without the necessity of acquiring fresh training data - a process that typically demands substantial investments in time and capital. In this paper, we explore the feasibility of generating multiple versions of a model that possess different attack properties, without acquiring new training data or changing model architecture. The model owner can deploy one version at a time and replace a leaked version immediately with a new version. The newly deployed model version can resist adversarial attacks generated leveraging white-box access to one or all previously leaked versions. We show theoretically that this can be accomplished by incorporating parameterized hidden distributions into the model training data, forcing the model to learn task-irrelevant features uniquely defined by the chosen data. Additionally, optimal choices of hidden distributions can produce a sequence of model versions capable of resisting compound transferability attacks over time. Leveraging our analytical insights, we design and implement a practical model versioning method for DNN classifiers, which leads to significant robustness improvements over existing methods. We believe our work presents a promising direction for safeguarding DNN services beyond their initial deployment.
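To make the abstract's core idea concrete, the sketch below trains model "versions" on data blended with a version-specific hidden distribution, here simply a fixed Gaussian noise pattern keyed by a seed, so that each version picks up distinct task-irrelevant features. This is a minimal hypothetical illustration, not the paper's implementation: the hidden_pattern and embed helpers, the use of CIFAR-10 and ResNet-18, and the additive blending are assumptions made for the example; the paper's actual construction, and its choice of optimal hidden distributions for resisting compound transferability attacks, is more involved.

```python
# Hypothetical sketch: each model version is trained on data blended with a
# version-specific hidden distribution (a seeded Gaussian pattern), so that
# different versions learn different task-irrelevant features. A replacement
# version uses a fresh seed. This is NOT the authors' implementation.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader
from torchvision import datasets, transforms
from torchvision.models import resnet18

def hidden_pattern(seed: int, shape=(3, 32, 32), scale: float = 0.1) -> torch.Tensor:
    """Sample one parameterized hidden distribution: a fixed Gaussian pattern keyed by `seed`."""
    gen = torch.Generator().manual_seed(seed)
    return scale * torch.randn(shape, generator=gen)

def embed(images: torch.Tensor, pattern: torch.Tensor, alpha: float = 1.0) -> torch.Tensor:
    """Blend the version-specific pattern into a batch of training images (assumed additive blending)."""
    return torch.clamp(images + alpha * pattern, 0.0, 1.0)

def train_version(version_seed: int, epochs: int = 1, device: str = "cpu") -> nn.Module:
    """Train one model version on CIFAR-10 with its own hidden distribution embedded."""
    pattern = hidden_pattern(version_seed).to(device)
    data = datasets.CIFAR10("./data", train=True, download=True,
                            transform=transforms.ToTensor())
    loader = DataLoader(data, batch_size=128, shuffle=True)

    model = resnet18(num_classes=10).to(device)
    opt = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9, weight_decay=5e-4)
    loss_fn = nn.CrossEntropyLoss()

    model.train()
    for _ in range(epochs):
        for x, y in loader:
            x, y = x.to(device), y.to(device)
            x = embed(x, pattern)  # task-irrelevant features unique to this version
            opt.zero_grad()
            loss_fn(model(x), y).backward()
            opt.step()
    return model

# Usage: when a deployed version leaks, train and deploy a replacement with a new seed.
# v1 = train_version(version_seed=1)
# v2 = train_version(version_seed=2)  # intended to resist attacks crafted against v1
```

The design choice illustrated here is only that the version identity is carried by the training data (via the hidden distribution) rather than by the architecture or by new training data; how the paper selects and parameterizes these distributions to maximize robustness across a sequence of leaked versions is described in the full text.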

