Do You Trust Your Model? Emerging Malware Threats in the Deep Learning Ecosystem (2403.03593v2)

Published 6 Mar 2024 in cs.CR and cs.AI

Abstract: Training high-quality deep learning models is a challenging task due to computational and technical requirements. A growing number of individuals, institutions, and companies increasingly rely on pre-trained, third-party models made available in public repositories. These models are often used directly or integrated into product pipelines with no particular precautions, since they are effectively just data in tensor form and considered safe. In this paper, we raise awareness of a new machine learning supply chain threat targeting neural networks. We introduce MaleficNet 2.0, a novel technique to embed self-extracting, self-executing malware in neural networks. MaleficNet 2.0 uses spread-spectrum channel coding combined with error correction techniques to inject malicious payloads into the parameters of deep neural networks. The MaleficNet 2.0 injection technique is stealthy, does not degrade the performance of the model, and is robust against removal techniques. We design our approach to work both in traditional and in distributed learning settings such as Federated Learning, and demonstrate that it is effective even when a reduced number of bits is used for the model parameters. Finally, we implement a proof-of-concept self-extracting neural network malware using MaleficNet 2.0, demonstrating the practicality of the attack against a widely adopted machine learning framework. Our aim with this work is to raise awareness of these new, dangerous attacks both in the research community and in industry, and we hope to encourage further research on mitigation techniques against such threats.
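
The core mechanism described in the abstract, spreading each payload bit across many model parameters with a pseudo-random code and recovering it later by correlation, can be illustrated with a minimal sketch. The snippet below is a hypothetical simplification, not the authors' implementation: the function names (embed, extract), the gain gamma, the fixed seed, and the synthetic NumPy weight vector are all illustrative assumptions, and the error-correction layer that MaleficNet 2.0 applies on top of the spread-spectrum encoding is omitted.

import numpy as np

# Hypothetical sketch of spread-spectrum (CDMA-style) payload embedding in a
# flat weight vector. Each payload bit is spread across ALL weights with its
# own pseudo-random +/-1 code, scaled by a small gain so the perturbation
# stays below the natural variance of the trained parameters.

def embed(weights, payload_bits, gamma=1e-3, seed=42):
    """Return weights + sum_i gamma * b_i * c_i, with b_i in {-1, +1}."""
    g = np.random.default_rng(seed)
    w = weights.copy()
    for bit in payload_bits:
        code = g.choice([-1.0, 1.0], size=w.shape)  # spreading code for this bit
        w += gamma * (1.0 if bit else -1.0) * code
    return w

def extract(stego_weights, n_bits, seed=42):
    """Recover each bit by correlating the carrier with its spreading code."""
    g = np.random.default_rng(seed)
    bits = []
    for _ in range(n_bits):
        code = g.choice([-1.0, 1.0], size=stego_weights.shape)
        bits.append(int(np.dot(stego_weights, code) > 0.0))
    return bits

# Toy usage: 8 bits hidden in 100k synthetic "weights". In the actual attack
# the carrier would be the parameter tensors of a real DNN and the payload
# would be a malware binary additionally protected with error-correcting codes.
rng = np.random.default_rng(0)
weights = rng.normal(0.0, 0.05, size=100_000)
payload = [1, 0, 1, 1, 0, 0, 1, 0]
stego = embed(weights, payload)
assert extract(stego, len(payload)) == payload

Because each bit's contribution is a low-amplitude signal distributed over the entire parameter vector, the per-weight perturbation is tiny (which is why model accuracy is preserved), while correlation against the matching spreading code concentrates the signal enough to decode it reliably; in the paper, error correction further protects the payload against quantization or fine-tuning of the parameters.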

