Enhancing Efficiency and Privacy in Memory-Based Malware Classification through Feature Selection (2310.00516v2)
Abstract: Malware poses a significant security risk to individuals, organizations, and critical infrastructure by compromising systems and data. Memory dumps, which provide snapshots of a computer's memory, can aid the analysis and detection of malicious content, including malware. Feature selection can play a critical role in improving the efficacy of malware classification systems and in addressing their privacy concerns: by identifying the most relevant features, it minimizes the amount of data fed to classifiers. In this study, we employ three feature selection approaches to identify significant features from memory content and use them with a diverse set of classifiers to enhance the performance and privacy of the classification task. Comprehensive experiments are conducted across three levels of malware classification tasks: i) binary classification (benign or malware), ii) malware type classification (including Trojan horse, ransomware, and spyware), and iii) malware family classification within each malware type (with varying numbers of classes). Results demonstrate that feature selection, using mutual information and other methods, enhances classifier performance for all tasks. Notably, selecting only 25% or 50% of the input features using Mutual Information and then employing the Random Forest classifier yields the best results. Our findings reinforce the importance of feature selection for malware classification and provide valuable insights for identifying appropriate approaches. By advancing the effectiveness and privacy of malware classification systems, this research contributes to safeguarding against security threats posed by malicious software.
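To make the selection step in the abstract concrete, the following is a minimal sketch of ranking features by their mutual information with the class label and keeping only the top fraction (for example 25%). It is not the authors' implementation: it uses only the Python standard library, it assumes the memory-derived features have already been discretized into a small number of values, and the names `mutual_information`, `select_top_fraction`, `rows`, and `labels` are hypothetical. The Random Forest training step that would follow is omitted.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Mutual information (in nats) between two discrete sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px = Counter(xs)
    py = Counter(ys)
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x, y) * log(p(x, y) / (p(x) * p(y))), written with raw counts
        mi += (c / n) * math.log(c * n / (px[x] * py[y]))
    return mi


def select_top_fraction(rows, labels, fraction):
    """Return (sorted indices of the kept features, per-feature MI scores).

    Assumption: each row is a list of already-discretized feature values.
    """
    n_features = len(rows[0])
    scores = [
        mutual_information([row[j] for row in rows], labels)
        for j in range(n_features)
    ]
    k = max(1, round(n_features * fraction))  # keep at least one feature
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:k]), scores


# Toy data (made up for illustration): feature 0 mirrors the label,
# features 1 and 2 are independent of it, feature 3 is constant.
rows = [
    [0, 1, 0, 2],
    [0, 0, 1, 2],
    [1, 1, 0, 2],
    [1, 0, 1, 2],
]
labels = [0, 0, 1, 1]

selected, scores = select_top_fraction(rows, labels, 0.25)
print(selected, scores)
```

In this toy case only the feature that mirrors the label carries information, so it is the one retained at 25%. Keeping fewer columns is also what ties the technique to the privacy argument: the classifier never sees the discarded features.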