AI ATAC 1: An Evaluation of Prominent Commercial Malware Detectors (2308.14835v1)
Abstract: This work presents an evaluation of six prominent commercial endpoint malware detectors, a network malware detector, and a file-conviction algorithm from a cyber technology vendor. The evaluation was administered as the first of the Artificial Intelligence Applications to Autonomous Cybersecurity (AI ATAC) prize challenges, funded by and completed in service of the US Navy. The experiment employed 100K files (50% benign, 50% malicious) with a stratified distribution of file types, including ~1K zero-day program executables (increasing experiment size by two orders of magnitude over previous work). We present an evaluation process that delivers a file to a fresh virtual machine running the detection technology, waits 90 s to allow static detection, then executes the file and waits another period for dynamic detection. This process yields observational data of greater fidelity than previous experiments, in particular resource and time-to-detection statistics. To execute all 800K trials (100K files $\times$ 8 tools), a software framework was designed to choreograph the experiment into a completely automated, time-synced, and reproducible workflow with substantial parallelization. A cost-benefit model was configured to integrate the tools' recall, precision, time to detection, and resource requirements into a single comparable quantity by simulating costs of use. This provides a ranking methodology for cyber competitions and a lens through which to reason about the varied statistical viewpoints of the results. These statistical and cost-model results provide insights into the state of commercial malware detection.
- Robert A. Bridges (34 papers)
- Brian Weber (26 papers)
- Justin M. Beaver (3 papers)
- Jared M. Smith (5 papers)
- Miki E. Verma (7 papers)
- Savannah Norem (3 papers)
- Kevin Spakes (2 papers)
- Cory Watson (4 papers)
- Jeff A. Nichols (3 papers)
- Brian Jewell (3 papers)
- Chelsey Dunivan Stahl (1 paper)
- Kelly M. T. Huffer (3 papers)
- T. Sean Oesch (1 paper)
- Michael D. Iannacone (1 paper)
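
The abstract describes a cost-benefit model that folds recall, precision, time to detection, and resource requirements into one comparable quantity by simulating costs of use. The sketch below is only an illustration of that general idea, not the paper's model: the linear damage-until-detection rule, the `DetectorStats` fields, the function names, and every cost parameter are hypothetical assumptions chosen to keep the example small.

```python
from dataclasses import dataclass


@dataclass
class DetectorStats:
    """Hypothetical per-tool summary of an experiment (not the paper's schema)."""
    name: str
    true_positives: int            # malicious files that were alerted on
    false_positives: int           # benign files that were alerted on
    false_negatives: int           # malicious files that were missed
    mean_seconds_to_detect: float  # average delay for detected malware
    resource_cost: float           # assumed cost of CPU, memory, licensing


def recall(s: DetectorStats) -> float:
    return s.true_positives / (s.true_positives + s.false_negatives)


def precision(s: DetectorStats) -> float:
    return s.true_positives / (s.true_positives + s.false_positives)


def simulated_cost(
    s: DetectorStats,
    max_attack_cost: float = 1000.0,        # assumed cost of an undetected attack
    attack_cost_per_second: float = 1.0,    # assumed damage rate before detection
    false_alert_cost: float = 10.0,         # assumed triage cost per false alert
) -> float:
    """Toy cost of use: lower is better. All parameters are illustrative."""
    # Detected malware still does damage until it is caught, capped at the maximum.
    damage_when_detected = min(
        s.mean_seconds_to_detect * attack_cost_per_second, max_attack_cost
    )
    return (
        s.true_positives * damage_when_detected
        + s.false_negatives * max_attack_cost
        + s.false_positives * false_alert_cost
        + s.resource_cost
    )


def rank_by_cost(tools: list[DetectorStats]) -> list[str]:
    """Return tool names ordered from cheapest to most expensive to use."""
    return [t.name for t in sorted(tools, key=simulated_cost)]


# Made-up numbers for two imaginary tools.
fast_tool = DetectorStats("fast", 90, 5, 10, 30.0, 200.0)
thorough_tool = DetectorStats("thorough", 95, 20, 5, 120.0, 100.0)
```

In this made-up example the tool with higher recall ranks worse, because its slower detection and extra false alerts outweigh the attacks it catches. That is the kind of trade-off a single cost figure is meant to expose, although the outcome depends entirely on the assumed parameters.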