- The paper demonstrates that using less common programming languages and compilers can drastically lower malware detection rates, as evidenced by experiments on 100 unique samples.
- The study implements two standard malicious payloads in 39 languages with 50 compiler/packager variants to highlight significant binary structural differences.
- The results show that language-specific binary characteristics obscure malicious intent and complicate static analysis, which calls for improved detection methods.
This paper, "Coding Malware in Fancy Programming Languages for Fun and Profit" (2503.19058), investigates how malware authors can evade detection based on traditional static analysis and complicate reverse engineering simply by choosing less common programming languages and compilers. The authors combined a data-driven study with targeted experiments to answer three research questions:
- How does the programming language and compiler choice impact the malware detection rate?
- What is the root cause of this disparity in detection?
- What are the benefits for an attacker beyond detection rate?
The motivation for the paper stemmed from observed trends in real-world malware datasets (Malware Bazaar and an APT dataset). Analysis of these datasets revealed a shift towards more diverse programming languages and compilers by malware authors over time. Crucially, this shift correlated with significant deviations in detection rates, where less common language/compiler combinations often resulted in lower detection rates, even for well-known malware families.
To systematically investigate this, the authors designed an experiment where they implemented two simple, well-known malicious payloads:
- Payload I: A Powershell reverse shell initiated via a system command.
- Payload II: Shellcode execution using standard Windows API calls (`VirtualAlloc`, `RtlMoveMemory`, `CreateThread`).
These payloads were written in 39 different programming languages and compiled/packaged using 50 different compilers or packagers, resulting in 100 unique executable samples. The authors deliberately avoided additional obfuscation techniques to isolate the effect of the language/compiler itself.
The samples were submitted to VirusTotal to assess their detection rates (RQ1). The results showed a significant variance:
- For Payload I, 13 samples had zero detections, and 19 had very low detection rates (< 5 engines), resulting in an overall low detection rate.
- For Payload II, 2 samples had zero detections, and 11 had very low detection rates, with generic signatures often being the only flags.
These findings demonstrated that simple payloads, typically easily detectable, could evade a significant number of antivirus engines just by being implemented in certain less common languages or using specific compilers.
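The bucketing used in these results can be sketched in a few lines. This is a minimal illustration, not the authors' tooling: the function names, the sample labels, and the example counts are all hypothetical, and it assumes that "zero" and "very low" (fewer than 5 engines) are disjoint categories.

```python
from collections import Counter

# Assumption: "very low" means at least 1 but fewer than 5 detecting engines,
# following the "< 5 engines" wording in the summary above.
VERY_LOW_THRESHOLD = 5


def bucket(detections: int) -> str:
    """Map a per-sample engine detection count to a coarse category."""
    if detections < 0:
        raise ValueError("detection count cannot be negative")
    if detections == 0:
        return "zero"
    if detections < VERY_LOW_THRESHOLD:
        return "very_low"
    return "other"


def summarize(results: dict[str, int]) -> Counter:
    """Count how many samples fall into each category."""
    return Counter(bucket(count) for count in results.values())


if __name__ == "__main__":
    # Made-up detection counts for three hypothetical samples.
    example = {"sample_a": 0, "sample_b": 3, "sample_c": 27}
    print(summarize(example))
```

Applying `summarize` separately to the Payload I and Payload II results would give per-payload tallies of the kind reported above.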
To understand the root cause of this disparity (RQ2), the authors analyzed the structural characteristics of the generated binaries and performed static analysis using tools like `capa`. Key metrics examined included the number of sections, threads, loaded DLLs, function counts, binary size, shellcode fragmentation, and control flow complexity.
- Binary Structure: Samples performing the same task varied dramatically in size, section count, thread count, and especially the number of functions (ranging from 6 to over 80,000 functions). This structural difference impacts signature-based detection.
- Static Analysis (capa): While `capa` could identify malicious capabilities in many samples, it sometimes failed to do so for the samples that evaded detection on VirusTotal, indicating that the language/compiler-specific structure hid the functionality from generic capability signatures.
- Shellcode Fragmentation: A pattern matching analysis showed that in languages like C/C++, the shellcode often remained sequential or had predictable gaps, making pattern matching effective. However, in languages like Rust, Phix, Lisp, and Haskell, the shellcode bytes were heavily fragmented and dispersed unpredictably throughout the binary or stored in non-obvious ways (e.g., pushed byte-by-byte onto the stack in Phix, dynamically assembled in Haskell), rendering static pattern matching ineffective.
- Reverse Engineering Metrics: Metrics like cyclomatic complexity, unique basic blocks/instructions executed, and indirect calls/jumps were measured via execution traces. Languages with substantial runtimes (Java, Go, Haskell, etc.) introduced significant complexity (high function counts, many indirect branches), even for simple payloads. The authors presented a case study on the Haskell shellcode sample. It illustrates how the Glasgow Haskell Compiler (GHC) runtime's use of a separate STG-machine, lazy evaluation, and continuation-passing style leads to complex, non-linear control flow involving thousands of runtime instructions and indirect jumps. This makes traditional disassembly and static analysis exceptionally difficult compared to the straightforward linear code that a C compiler generates for the same task, and it hides the malicious logic.
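The shellcode fragmentation idea can be illustrated with a small measurement sketch. The summary does not specify how the authors implemented their pattern matching, so the greedy approach, the function names, and the `min_len` parameter below are assumptions made for illustration. The sketch splits a known byte pattern into the longest runs that still appear contiguously in a binary blob: one long run suggests the pattern is stored sequentially, many short runs suggest fragmentation, and no runs suggest byte-wise scattering or a transformed encoding.

```python
def longest_match_at(pattern: bytes, start: int, blob: bytes, min_len: int) -> int:
    """Length of the longest prefix of pattern[start:] found contiguously in blob.

    Returns 0 if no prefix of at least min_len bytes is present.
    """
    if start + min_len > len(pattern):
        return 0
    if blob.find(pattern[start:start + min_len]) == -1:
        return 0
    length = min_len
    # Extend one byte at a time while the longer prefix is still present.
    while (start + length < len(pattern)
           and blob.find(pattern[start:start + length + 1]) != -1):
        length += 1
    return length


def fragment_runs(pattern: bytes, blob: bytes, min_len: int = 4) -> list[tuple[int, int]]:
    """Greedily cover pattern with (offset, length) runs that occur in blob."""
    runs = []
    i = 0
    while i < len(pattern):
        n = longest_match_at(pattern, i, blob, min_len)
        if n:
            runs.append((i, n))
            i += n
        else:
            i += 1  # this byte is not part of any sufficiently long run
    return runs


if __name__ == "__main__":
    # A harmless 32-byte marker stands in for the known byte pattern.
    marker = bytes(range(16, 48))
    chunks = [marker[i:i + 8] for i in range(0, 32, 8)]
    sequential = b"\x00" * 8 + marker + b"\x00" * 8
    reordered = b"\xff\xff\xff".join([chunks[2], chunks[0], chunks[3], chunks[1]])
    print(fragment_runs(marker, sequential))  # a single run covering the marker
    print(fragment_runs(marker, reordered))   # several shorter runs
```

A check like this only detects bytes that are split or reordered. It would not recognize a pattern that is assembled at runtime or stored in a transformed form, which is the situation described for the Haskell sample.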
The paper also discussed benefits for attackers beyond just static analysis evasion (RQ3). These include:
- Cross-Compilation: Languages like Go offer easy cross-compilation for multiple operating systems and architectures (x86, x64, ARM, MIPS, etc.). This allows malware authors to target a broader range of devices, including IoT, with less development effort.
- Language Features: Modern languages offer features like memory safety (Rust), garbage collection, and rich standard libraries that can simplify malware development while potentially introducing structures that challenge analysis tools designed for C/C++.
- Reduced Tooling Needs: Languages that produce statically linked binaries (as Go does by default) reduce runtime dependencies, simplifying deployment.
- Assistance from LLMs: LLMs can facilitate code translation between languages, lowering the barrier for attackers to implement malware components in languages they are not familiar with.
The authors conclude that while traditional C/C++ compiled with Microsoft's toolchain remains prevalent, the choice of language and compiler provides malware authors with an effective layer of obfuscation. The root cause lies in the fundamental differences in how compilers generate binaries, handle data, represent functions, manage memory, and implement runtime environments. These differences produce structural characteristics such as function bloat, code fragmentation, and complex indirect control flow, which render signature-based and traditional static analysis tools less effective. Ignoring malware written in less common languages creates a "hideout" that APTs and Malware-as-a-Service (MaaS) operators are already exploiting. Therefore, the paper stresses the need for deeper analysis of binaries from a wider range of languages and compilers to develop more robust detection methods and improve reverse engineering tools.