
EvadeDroid: A Practical Evasion Attack on Machine Learning for Black-box Android Malware Detection (2110.03301v4)

Published 7 Oct 2021 in cs.LG and cs.CR

Abstract: Over the last decade, researchers have extensively explored the vulnerabilities of Android malware detectors to adversarial examples through the development of evasion attacks; however, the practicality of these attacks in real-world scenarios remains arguable. The majority of studies have assumed attackers know the details of the target classifiers used for malware detection, while in reality, malicious actors have limited access to the target classifiers. This paper introduces EvadeDroid, a problem-space adversarial attack designed to effectively evade black-box Android malware detectors in real-world scenarios. EvadeDroid constructs a collection of problem-space transformations derived from benign donors that share opcode-level similarity with malware apps by leveraging an n-gram-based approach. These transformations are then used to morph malware instances into benign ones via an iterative and incremental manipulation strategy. The proposed manipulation technique is a query-efficient optimization algorithm that can find and inject optimal sequences of transformations into malware apps. Our empirical evaluations, carried out on 1K malware apps, demonstrate the effectiveness of our approach in generating real-world adversarial examples in both soft- and hard-label settings. Our findings reveal that EvadeDroid can effectively deceive diverse malware detectors that utilize different features with various feature types. Specifically, EvadeDroid achieves evasion rates of 80%-95% against DREBIN, Sec-SVM, ADE-MA, MaMaDroid, and Opcode-SVM with only 1-9 queries. Furthermore, we show that the proposed problem-space adversarial attack is able to preserve its stealthiness against five popular commercial antiviruses with an average of 79% evasion rate, thus demonstrating its feasibility in the real world.
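The donor-selection idea mentioned in the abstract (benign apps that share opcode-level similarity with malware, found with an n-gram-based approach) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not state the similarity measure or the value of n, so the Jaccard index, the default `n=3`, and the helper names `opcode_ngrams`, `jaccard`, and `rank_donors` are all assumptions chosen for illustration, and the opcode sequences are toy data.

```python
from typing import Dict, List, Set, Tuple

NGram = Tuple[str, ...]


def opcode_ngrams(opcodes: List[str], n: int = 3) -> Set[NGram]:
    """Return the set of contiguous opcode n-grams in a sequence."""
    return {tuple(opcodes[i:i + n]) for i in range(len(opcodes) - n + 1)}


def jaccard(a: Set[NGram], b: Set[NGram]) -> float:
    """Jaccard index of two n-gram sets (assumed metric, not from the paper)."""
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def rank_donors(malware_ops: List[str],
                donors: Dict[str, List[str]],
                n: int = 3) -> List[Tuple[str, float]]:
    """Rank hypothetical benign donors by n-gram similarity to a malware sample."""
    target = opcode_ngrams(malware_ops, n)
    scored = [(name, jaccard(target, opcode_ngrams(ops, n)))
              for name, ops in donors.items()]
    # Most similar donor first.
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy opcode sequences, purely illustrative.
malware = ["const", "invoke", "move", "return"]
donors = {
    "donor_a": ["const", "invoke", "move", "goto"],
    "donor_b": ["add", "sub", "mul"],
}
ranking = rank_donors(malware, donors, n=2)
```

In this toy example `donor_a` shares two of four distinct 2-grams with the malware sample, so it ranks above `donor_b`, which shares none. A real pipeline would first have to disassemble each APK to obtain its opcode sequence, which is outside the scope of this sketch.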

Authors (2)
  1. Hamid Bostani
  2. Veelasha Moonsamy
