- The paper proposes a novel detection method that leverages anomaly analysis of per-domain DNS traffic features to capture both high-throughput tunneling and low-throughput exfiltration malware.
- It utilizes statistical features such as character entropy, the ratio of non-IP record types, and unique query volume, combined with an Isolation Forest one-class classifier, to differentiate malicious from benign domains.
- By testing on large-scale DNS logs with real malware samples, the study demonstrates high detection rates while noting limitations such as the reliance on dedicated malicious domains for exfiltration.
Data exfiltration over the Domain Name System (DNS) protocol is a significant security threat. While the DNS protocol was designed for name resolution, its widespread availability and minimal monitoring in many environments make it an attractive covert channel for attackers. Existing research has primarily focused on detecting high-throughput DNS tunneling tools, often used to bypass network restrictions for general internet access. However, a distinct class of threats involves low-throughput malware specifically designed to exfiltrate small amounts of sensitive data (like credit card numbers or credentials) slowly over DNS, often going undetected by methods optimized for high-volume tunneling.
This paper proposes a practical method for detecting both high-throughput DNS tunneling and low-throughput DNS exfiltration malware by identifying anomalous query patterns at the primary domain level. The core idea is that malicious domains used for exfiltration, whether tunneling or malware, exhibit unique characteristics in their DNS query traffic compared to legitimate domains. Furthermore, these malicious operations often rely on domains specifically registered for the attack rather than compromising existing legitimate domains. This observation forms the basis for a detection and blocking strategy centered on the primary domain.
The proposed method operates as a constantly running process designed for integration with or alongside existing DNS infrastructure that supports traffic logging and domain blacklisting (see Figure 1). It involves the following main phases:
- Data Collection: Streaming DNS traffic logs (containing query name, response, and record type) are processed and grouped by their primary domain. To detect "low and slow" attacks that unfold over hours, logs for each domain are collected over a sliding window spanning the last n_s × λ minutes, where λ is the data-collection frequency in minutes (e.g., 15 or 60) and n_s is the window size in λ-length intervals (e.g., 6 or 24). This allows the system to observe query behavior over extended periods.
- Feature Extraction: Every λ minutes, for each domain with sufficient traffic within the sliding window, a feature vector is computed (a code sketch after this list illustrates the windowing and the feature computation). These features capture statistical properties indicative of data exchange or encoding:
- Character Entropy (E): Measures the randomness of characters in subdomain names, helping detect encoded payloads.
- Non-IP RR Type Ratio (NI): The proportion of query types that are not A or AAAA (standard IP address lookups), such as TXT or NULL records, which are often used in tunneling for larger data transfer.
- Unique Query Ratio (Uniq): The proportion of unique subdomains queried for a primary domain, expected to be high for data exfiltration where each query might encode unique data bits.
- Unique Query Volume (Vol): The absolute number of unique subdomains queried, which can be high for rapid data exfiltration or simply due to the non-repeating nature of exfiltration queries.
- Query Length Average (Len): The average length of the full query names, as encoded data often results in longer subdomains.
- Longest Meaningful Word Ratio (LMW): The ratio of the longest English dictionary word found in a subdomain label to the total subdomain length. This feature helps distinguish between randomly generated/encoded subdomains and human-readable or legitimate ones.
These features are chosen to be statistical in nature, making the resulting feature vectors less dependent on the overall traffic volume of the DNS server, thus aiding model portability.
- Anomaly Detection: An Isolation Forest model, a one-class classifier, is used to identify domains with anomalous feature vectors. The model is trained beforehand on a dataset of known benign DNS traffic. During operation, each computed feature vector for a domain is passed to the trained model, which assigns an anomaly score. If the score exceeds a pre-defined threshold T_s (derived from the training data based on an acceptable false-positive rate ν), the domain is classified as anomalous.
- Blocking: Domains classified as anomalous are immediately marked for blocking. In a real-world deployment, legitimate services that might use DNS for data exchange (e.g., some security software updates or lookups) would be initially flagged and manually white-listed by a security expert. After this initial white-listing phase, any new domains detected as anomalous are considered malicious and blocked indefinitely by the DNS server (e.g., by returning NXDOMAIN or no response). This per-domain blocking strategy is practical because, as observed, malware often uses dedicated malicious domains.
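To make the per-domain processing concrete, the sketch below shows one plausible Python implementation of the windowed grouping and the six features. All names here (`group_by_primary_domain`, `extract_features`, the toy dictionary) are illustrative assumptions rather than the paper's code, and a production version would use a public-suffix list instead of the two-label primary-domain heuristic shown.

```python
import math
from collections import Counter, defaultdict

# Toy stand-in for the English dictionary behind the LMW feature;
# the paper's actual word list is not specified here.
DICTIONARY = {"mail", "update", "static", "images", "api", "cdn"}

def primary_domain(qname: str) -> str:
    """Two-label heuristic (example.com); real code should use a public-suffix list."""
    return ".".join(qname.rstrip(".").split(".")[-2:])

def group_by_primary_domain(logs, now, lam_minutes, ns):
    """Keep queries from the last ns * lam_minutes and bucket them by primary domain."""
    horizon = now - ns * lam_minutes * 60  # window start, epoch seconds
    buckets = defaultdict(list)
    for ts, qname, rtype in logs:
        if ts >= horizon:
            buckets[primary_domain(qname)].append((qname, rtype))
    return buckets

def char_entropy(s: str) -> float:
    """Shannon entropy in bits per character."""
    if not s:
        return 0.0
    n = len(s)
    return -sum(c / n * math.log2(c / n) for c in Counter(s).values())

def lmw_ratio(label: str) -> float:
    """Longest dictionary word found in the label, divided by label length."""
    if not label:
        return 0.0
    best = max(
        (j - i
         for i in range(len(label))
         for j in range(i + 1, len(label) + 1)
         if label[i:j] in DICTIONARY),
        default=0,
    )
    return best / len(label)

def extract_features(window):
    """Six features over one domain's queries in the sliding window.

    `window` is a list of (qname, rtype) pairs sharing a primary domain,
    e.g. ("dGVzdA0x.attacker.com", "TXT").
    """
    # Strip the primary domain; assumes a two-label primary domain.
    subs = [q.rstrip(".").rsplit(".", 2)[0] for q, _ in window]
    uniq = set(subs)
    return {
        "E":    char_entropy("".join(subs)),
        "NI":   sum(rt not in ("A", "AAAA") for _, rt in window) / len(window),
        "Uniq": len(uniq) / len(window),
        "Vol":  len(uniq),
        "Len":  sum(len(q) for q, _ in window) / len(window),
        # Averaging per-label LMW is one plausible aggregation choice.
        "LMW":  sum(lmw_ratio(s) for s in subs) / len(subs),
    }
```

Every λ minutes, `extract_features` would run on each bucket with sufficient traffic, and the resulting vectors would be handed to the anomaly detector sketched after the next paragraph.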
Practical implementation requires careful consideration of parameters like λ and n_s, which influence detection latency and the ability to catch very slow attacks, balanced against computational and storage resources. The acceptable false-positive rate ν is also critical, trading off detection sensitivity against the volume of domains requiring manual review. The evaluation demonstrates that even with a very low ν (2×10⁻⁵), the method achieves high detection rates.
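A complementary sketch of the anomaly-detection step, using scikit-learn's IsolationForest as one concrete one-class implementation (the paper's exact training setup is not reproduced here). The `contamination` parameter plays the role of ν, and scikit-learn derives the internal decision threshold (the paper's T_s) from it; the benign training matrix below is a random placeholder, not real traffic, and the illustrative ν differs from the paper's 2×10⁻⁵ setting.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

# Placeholder for feature vectors computed from known-benign traffic;
# a real deployment would train on vectors from extract_features().
rng = np.random.default_rng(0)
benign_vectors = rng.random((10_000, 6))

# contamination plays the role of the acceptable false-positive rate ν.
model = IsolationForest(n_estimators=100, contamination=1e-4, random_state=0)
model.fit(benign_vectors)

def is_anomalous(vec) -> bool:
    """True if the vector falls outside the learned benign region.

    predict() applies the threshold scikit-learn derives from
    `contamination`, mirroring the paper's T_s; -1 marks an anomaly.
    """
    return model.predict(np.asarray(vec, dtype=float).reshape(1, -1))[0] == -1

# Example: one domain's feature vector [E, NI, Uniq, Vol, Len, LMW].
if is_anomalous([3.9, 0.95, 0.99, 5000, 60.0, 0.05]):
    print("flag domain for blocking / expert review")
```

In this setup, domains flagged by `is_anomalous` would be routed to the white-listing review or blocked, as described in the blocking phase above.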
The method was evaluated on a large-scale recursive DNS server's logs, peaking at 47 million requests per hour. This dataset was injected with traffic from:
- High-throughput tunneling tools: Iodine and Dns2tcp.
- Low-throughput exfiltration malware: FrameworkPOS (exfiltrating credit cards) and Backdoor.Win32.Denis (Trojan communication).
The results show that the proposed method successfully detected all four test subjects. In contrast, two recent methods optimized for tunneling detection were re-evaluated on the same dataset and found to be ineffective at detecting the low-throughput malware scenarios, highlighting the novel contribution of this work. The false positive analysis demonstrated that while some legitimate services using DNS for data exchange were initially flagged, the rate of new false positives decreased significantly over time, indicating that an initial white-listing phase is effective.
A key limitation acknowledged by the authors is the assumption that malware uses a single, dedicated domain for exfiltration. While this holds true for the evaluated malware samples, future attacks could potentially distribute exfiltration across multiple legitimate, compromised domains, requiring more sophisticated detection methods that correlate activity across domains or users. Addressing this is proposed for future work.
In summary, the paper presents a practical, scalable anomaly detection system for DNS data exfiltration that moves beyond just detecting high-throughput tunneling. By focusing on per-primary-domain characteristics observed over an extended time window and leveraging a one-class classifier, it effectively identifies both tunneling and low-throughput malware, enabling immediate and automated blocking of malicious domains in real-world DNS environments.