Explainable Deep Learning Models for Dynamic and Online Malware Classification
Abstract: Malware attacks have surged in recent years, necessitating more advanced preventive measures and remedial strategies. Several successful AI-based malware classification approaches exist, categorized by static, dynamic, or online analysis, but most of these models do not offer easily interpretable decisions or explanations of their processes. This paper examines explainable malware classification across execution environments (such as dynamic and online), analyzing their respective strengths, weaknesses, and commonalities. To evaluate our approach, we train Feed Forward Neural Networks (FFNN) and Convolutional Neural Networks (CNN) to classify malware based on features obtained from dynamic and online analysis environments. Feature attribution for malware classification is performed with three explainability tools: SHAP, LIME, and Permutation Importance. We evaluate in detail the global and local explanations computed in the experiments, discuss their limitations, and offer recommendations for achieving a balanced approach.
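Of the three attribution tools named in the abstract, Permutation Importance is the simplest to illustrate: shuffle one feature column at a time and measure how much the classifier's accuracy drops. The following is a minimal sketch of that general idea, not the authors' pipeline. The toy data, the `toy_predict` stand-in classifier, and the `permutation_importance` helper are all hypothetical and merely take the place of a trained FFNN or CNN and real dynamic or online features.

```python
import numpy as np


def permutation_importance(predict, X, y, n_repeats=10, seed=0):
    """Mean drop in accuracy when each feature column is shuffled."""
    rng = np.random.default_rng(seed)
    baseline = np.mean(predict(X) == y)
    importances = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        drops = []
        for _ in range(n_repeats):
            X_perm = X.copy()
            # Break the link between feature j and the label.
            X_perm[:, j] = rng.permutation(X_perm[:, j])
            drops.append(baseline - np.mean(predict(X_perm) == y))
        importances[j] = np.mean(drops)
    return importances


# Hypothetical toy data: 3 features, only the first two affect the label.
data_rng = np.random.default_rng(42)
X = data_rng.normal(size=(500, 3))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)


def toy_predict(X):
    # Stand-in for a trained classifier (assumption, not the paper's model).
    return (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)


scores = permutation_importance(toy_predict, X, y)
for name, score in zip(["feature_0", "feature_1", "feature_2"], scores):
    print(f"{name}: {score:.3f}")
```

In this toy setup the third feature is never used by the classifier, so shuffling it leaves accuracy unchanged, while the more heavily weighted first feature shows the largest drop. This yields a global explanation; SHAP and LIME additionally provide per-sample (local) attributions.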