- The paper presents an RL-based framework that systematically modifies PE files to achieve a 24% evasion rate on holdout samples.
- The methodology employs an actor-critic model with experience replay in a custom OpenAI gym environment to simulate realistic adversarial attacks.
- Adversarial retraining reduced evasion efficacy by 33%, highlighting the potential for developing more robust malware detection strategies.
Analyzing Malware Evasion through Reinforcement Learning
The research paper, "Learning to Evade Static PE Machine Learning Malware Models via Reinforcement Learning," explores a significant challenge in cybersecurity: the susceptibility of ML antivirus models to evasion techniques. The authors introduce a framework that leverages reinforcement learning (RL) to mount black-box attacks on static portable executable (PE) malware detection models. The approach is significant because it requires no knowledge of the model's internals, such as gradients in differentiable models, or even the scores that non-differentiable models report.
Framework and Methodology
The proposed RL-based attack framework stands out for its reliance on manipulations of the PE file that do not alter its intended functionality. The RL agent, trained with an actor-critic model and experience replay, learns sequences of such modifications that allow malware to evade detection. The paper emphasizes that this setup emulates real-world adversaries, who can probe a detector systematically without any insight into its parameters or architecture.
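To make the agent-environment interaction concrete, here is a minimal, dependency-free sketch of the episode loop and experience replay buffer such an agent would train from. Everything in it is illustrative: `ToyEvasionEnv`, its action and feature dimensions, the episode cap, and the reward value are stand-ins, not the paper's environment or hyperparameters, and the random action selection marks where the actor network would act and where the actor-critic update would run.

```python
import random
from collections import deque

# Stand-in for the paper's custom gym environment: the "state" is a feature
# vector for the current PE variant, each action is a functionality-preserving
# mutation index, and a positive reward is given when the classifier flips.
class ToyEvasionEnv:
    N_ACTIONS = 10      # e.g. append overlay, add section, rename section, ...
    N_FEATURES = 16     # the real framework uses a much larger feature vector

    def reset(self):
        self.turns = 0
        self.state = [random.random() for _ in range(self.N_FEATURES)]
        return self.state

    def step(self, action):
        self.turns += 1
        # Pretend the mutation perturbs the features; a real env rewrites the PE.
        self.state = [min(1.0, f + random.uniform(0.0, 0.05)) for f in self.state]
        evaded = random.random() < 0.02         # stand-in for a classifier flip
        done = evaded or self.turns >= 10       # episode capped at N mutations
        reward = 10.0 if evaded else 0.0
        return self.state, reward, done, {}

env = ToyEvasionEnv()
replay = deque(maxlen=50_000)                   # experience replay buffer

for episode in range(100):
    state, done = env.reset(), False
    while not done:
        # The actor network would choose the action here; random stand-in for brevity.
        action = random.randrange(env.N_ACTIONS)
        next_state, reward, done, _ = env.step(action)
        replay.append((state, action, reward, next_state, done))
        state = next_state
    # An actor-critic update would sample a minibatch from `replay` here and
    # take a gradient step on the policy (actor) and value (critic) networks.
    if len(replay) >= 32:
        batch = random.sample(replay, 32)       # minibatch for the update step
```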
The framework employs a custom OpenAI gym environment to drive the RL training loop across many episodes. The environment takes a static PE file, applies modifications drawn from a structured action space, and scores evasion success against a binary classifier. Each file's state is represented by a feature vector that combines PE header fields, byte histograms, and other features commonly used in malware classification models, giving the agent actionable feedback after every modification.
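The sketch below illustrates one slice of such a state representation: a normalized byte histogram plus simple byte statistics. It is an assumption-laden simplification; the paper's feature vector also folds in PE header fields, imports, and section statistics extracted with a PE parser, which are omitted here to keep the example dependency-free.

```python
import math
from collections import Counter

def byte_histogram(data: bytes) -> list:
    """Normalized 256-bin byte histogram, one common component of the state vector."""
    counts = Counter(data)
    total = max(len(data), 1)
    return [counts.get(b, 0) / total for b in range(256)]

def byte_entropy(data: bytes) -> float:
    """Shannon entropy of the raw bytes; high values often indicate packing."""
    return -sum(p * math.log2(p) for p in byte_histogram(data) if p > 0)

def feature_vector(data: bytes) -> list:
    # The paper's environment also includes PE header fields, imported-function
    # counts, and per-section statistics (via a PE parser such as LIEF);
    # those are left out of this sketch.
    return byte_histogram(data) + [byte_entropy(data), float(len(data))]

# Usage: the environment recomputes this vector after every mutation so the
# agent observes how its last action changed the file. Toy bytes stand in for
# a real PE read from disk.
demo_bytes = b"MZ" + bytes(62) + b"PE\x00\x00" + bytes(200)
state = feature_vector(demo_bytes)
```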
Results Interpretation
In the experimental evaluation, the RL framework demonstrated potential with notable evasion rates across various datasets, including generic malware, ransomware, Virut, and BrowseFox adware. Key results include:
- An evasion rate of 24% on a holdout set of VirusShare samples, revealing the method's ability to generalize beyond training data.
- A reduction in the median detection ratio reported by anti-virus engines on VirusTotal, indicating that the modified samples also evaded several real-world static analysis models and suggesting cross-model evasion.
Significantly, adversarial training on the evasive samples reduced evasion efficacy by 33% in subsequent attacks, illustrating the potential to harden models with evasion-derived training data.
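A minimal sketch of that retraining step is shown below. Since neither the paper's dataset nor its model is reproduced here, it uses synthetic feature vectors, a placeholder dimensionality, and scikit-learn's `GradientBoostingClassifier` as a generic stand-in for the detector; only the augment-and-refit pattern reflects the paper's procedure.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

rng = np.random.default_rng(0)
n_features = 256                      # placeholder; the real vector is much larger

# Placeholder data: benign (label 0) and malicious (label 1) feature vectors.
X_benign = rng.normal(0.0, 1.0, (500, n_features))
X_malicious = rng.normal(0.5, 1.0, (500, n_features))
X = np.vstack([X_benign, X_malicious])
y = np.array([0] * 500 + [1] * 500)

base_model = GradientBoostingClassifier().fit(X, y)

# Adversarial retraining: append the RL-generated evasive variants, labeled
# malicious, and fit a fresh model on the augmented training set.
X_evasive = rng.normal(0.3, 1.0, (100, n_features))   # stand-in for evaded samples
X_aug = np.vstack([X, X_evasive])
y_aug = np.concatenate([y, np.ones(100, dtype=int)])

hardened_model = GradientBoostingClassifier().fit(X_aug, y_aug)
# Evasion efficacy would then be re-measured by re-running the RL attack
# against hardened_model; the paper reports a 33% drop on that second pass.
```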
Limitations and Practical Challenges
There are several notable limitations. The efficacy of RL-based evasion depends on mutation strategies that remain PE format-compliant and preserve functionality without introducing execution faults. The authors found cases where modified PE files lost their original functionality because the manipulations induced unanticipated parsing errors, which points to reliability issues when applying certain obfuscation tricks or when manipulating less conventional sections of the PE structure.
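A cheap guard against the format-level breakage described above is to re-parse the mutated file before counting it as a candidate. The sketch below assumes the `pefile` package and is a structural check only; confirming that the program still runs would require executing it in an instrumented sandbox, which this snippet does not attempt.

```python
import pefile

def still_parses(mutated_bytes: bytes) -> bool:
    """Structural sanity check: does the mutated file still parse as a valid PE?

    This only catches format-level breakage (the kind of parsing error the
    paper observed after some manipulations); it says nothing about whether
    the program still executes correctly.
    """
    try:
        pe = pefile.PE(data=mutated_bytes, fast_load=True)
        pe.close()
        return True
    except pefile.PEFormatError:
        return False
```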
Furthermore, adversarial retraining can inadvertently bias the model toward artifacts introduced by the modification tools rather than genuinely malicious or benign characteristics. So while adversarial retraining did bolster the model's resilience against RL-generated samples, care is needed to avoid overfitting to peculiarly modified samples that do not exhibit typical malware behavior.
Future Implications
The findings from this research carry notable theoretical and practical implications for anti-malware defenses. Practically, security vendors can use such frameworks to proactively identify weaknesses in their detectors and improve model robustness. Theoretically, it opens avenues for further work at the intersection of RL and cybersecurity, particularly in designing richer manipulation strategies and action spaces and in adapting actor-critic algorithms to high-dimensional, high-stakes evasion scenarios.
Overall, the paper underscores the evolving landscape of adversarial techniques and emphasizes the continuous need for adaptive defense strategies against sophisticated attack methods in cybersecurity.