Demystifying Behavior-Based Malware Detection at Endpoints (2405.06124v1)

Published 9 May 2024 in cs.CR

Abstract: Machine learning is widely used for malware detection in practice. Prior behavior-based detectors most commonly rely on traces of programs executed in controlled sandboxes. However, sandbox traces are unavailable to the last line of defense offered by security vendors: malware detection at endpoints. A detector at endpoints consumes the traces of programs running on real-world hosts, as sandbox analysis might introduce intolerable delays. Despite their success in sandboxes, research hints at potential challenges for ML methods at endpoints, e.g., highly variable malware behaviors. Nonetheless, the impact of these challenges on existing approaches, and how well their excellent sandbox performance translates to the endpoint scenario, remains unquantified. We present the first measurement study of the performance of ML-based malware detectors at real-world endpoints. Leveraging a dataset of sandbox traces and a dataset of in-the-wild program traces, we evaluate two scenarios in which the endpoint detector is trained on (i) sandbox traces (convenient and accessible) and (ii) endpoint traces (less accessible, since they require collecting telemetry data). This allows us to identify a wide gap between prior methods' sandbox detection performance (over 90%) and their endpoint performance (below 20% and 50% in scenarios (i) and (ii), respectively). We pinpoint and characterize the challenges contributing to this gap, such as label noise, behavior variability, and sandbox evasion. To close this gap, we propose mitigations that yield a relative improvement of 5-30% over the baselines. Our evidence suggests that applying detectors trained on sandbox data to endpoint detection (scenario (i)) is challenging. The most promising direction is training detectors on endpoint data (scenario (ii)), which marks a departure from widespread practice. We implement a leaderboard for realistic detector evaluations to promote research.
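To make the two evaluation scenarios concrete, below is a minimal sketch, not the authors' pipeline: it assumes program traces have already been converted into fixed-length behavioral feature vectors, stands in synthetic data for the sandbox and endpoint datasets, and uses an off-the-shelf scikit-learn classifier. The function names, the simulated distribution shift, and the choice of metric are illustrative assumptions, not details from the paper.

# Minimal sketch of the two evaluation scenarios; NOT the paper's actual pipeline.
# Assumption: traces are already encoded as fixed-length behavioral feature
# vectors (e.g., hashed API-call counts); the arrays below are synthetic.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import recall_score

rng = np.random.default_rng(0)

def synthetic_traces(n, n_features=64, shift=0.0):
    """Generate placeholder (features, labels); 'shift' mimics a sandbox/endpoint distribution gap."""
    X = rng.normal(loc=shift, scale=1.0, size=(n, n_features))
    y = rng.integers(0, 2, size=n)  # 1 = malware, 0 = benign
    X[y == 1] += 0.5                # weak class signal so the task is learnable
    return X, y

# Stand-ins for the two data sources.
X_sandbox, y_sandbox = synthetic_traces(2000, shift=0.0)
X_endpoint, y_endpoint = synthetic_traces(2000, shift=1.5)

# Hold out part of the endpoint data for testing in both scenarios.
split = len(X_endpoint) // 2
X_ep_train, y_ep_train = X_endpoint[:split], y_endpoint[:split]
X_ep_test, y_ep_test = X_endpoint[split:], y_endpoint[split:]

# Scenario (i): train on sandbox traces, evaluate on endpoint traces.
clf_sandbox = GradientBoostingClassifier().fit(X_sandbox, y_sandbox)
tpr_i = recall_score(y_ep_test, clf_sandbox.predict(X_ep_test))

# Scenario (ii): train directly on endpoint traces, evaluate on held-out endpoint traces.
clf_endpoint = GradientBoostingClassifier().fit(X_ep_train, y_ep_train)
tpr_ii = recall_score(y_ep_test, clf_endpoint.predict(X_ep_test))

print(f"Scenario (i)  sandbox-trained detection rate on endpoint data: {tpr_i:.2f}")
print(f"Scenario (ii) endpoint-trained detection rate on endpoint data: {tpr_ii:.2f}")

Comparing the two printed detection rates on the same endpoint test set mirrors the paper's core measurement: how much performance is lost when training data comes from sandboxes rather than from the deployment environment.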
