Gotta catch 'em all: a Multistage Framework for honeypot fingerprinting

Published 22 Sep 2021 in cs.CR | (2109.10652v1)

Abstract: Honeypots are decoy systems that lure attackers by presenting them with a seemingly vulnerable system. They provide an early detection mechanism as well as a method for learning how adversaries work and think. However, over the last years, a number of researchers have shown methods for fingerprinting honeypots. This significantly decreases the value of a honeypot; if an attacker is able to recognize the existence of such a system, they can evade it. In this article, we revisit the honeypot identification field, by providing a holistic framework that includes state of the art and novel fingerprinting components. We decrease the probability of false positives by proposing a rigid multi-step approach for labeling a system as a honeypot. We perform extensive scans covering 2.9 billion addresses of the IPv4 space and identify a total of 21,855 honeypot instances. Moreover, we present a number of interesting side-findings such as the identification of more than 354,431 non-honeypot systems that represent potentially vulnerable servers (e.g. SSH servers with default password configurations and vulnerable versions). Lastly, we discuss countermeasures against honeypot fingerprinting techniques.

Summary

The paper presents a comprehensive multistage framework integrating active and passive fingerprinting methods to accurately detect honeypot instances.
It describes an active probe-based pipeline with sequential checks such as portscan, banner verification, and SSL certificate analysis.
The framework detects a broader range of honeypots than earlier methods, with fewer false positives, and identified 21,855 honeypot instances in internet-wide scans.

Multistage Framework for Honeypot Fingerprinting

This paper introduces a comprehensive multistage framework for honeypot fingerprinting. Honeypots are decoy systems that mimic real systems to lure attackers and gain insight into their methods. If attackers can detect and evade a honeypot, its value to network security drops significantly. Fingerprinting honeypots is therefore both a technical challenge and a necessary way to evaluate how robust such systems are.

Framework Overview

The proposed framework integrates both active and passive fingerprinting techniques. Active techniques involve direct interaction with the system using crafted probes, while passive techniques analyze available data without direct interaction, often leveraging third-party data sources like Shodan and Censys for metascan-based methods.

Active Probe-based Pipeline

This pipeline consists of several sequential checks, including:

Portscan: Identifying open ports associated with honeypot services.
Banner Check: Comparing service banners with known honeypot signatures.
Static HTTP Response: Checking for default static content returned by HTTP services, indicative of honeypots.
SSL/TLS Certificate Check: Analyzing certificate attributes for known honeypot default certificates.
Protocol Handshake: Identifying anomalous protocol negotiation behaviors.
Library Dependency Check: Detecting honeypots using obsolete or specific libraries.
Static Command Response: Probing for known static command responses.

To keep false positives low, the framework labels an instance as a honeypot only after every relevant check has confirmed it, rather than relying on any single indicator.
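The staged labeling rule can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Observation` layout, the check names, the early exit on the first failed stage, and all signature values are assumptions made for illustration, and the signatures are placeholders rather than real honeypot fingerprints. The sketch works on data that has already been collected, so it sends no network traffic.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Set, Tuple


@dataclass
class Observation:
    """Data already gathered from one host (hypothetical layout)."""
    open_ports: Set[int] = field(default_factory=set)
    banner: Optional[str] = None
    http_body: Optional[str] = None
    cert_sha256: Optional[str] = None


# Placeholder signatures for illustration only, NOT real honeypot fingerprints.
HONEYPOT_PORTS = {2222}
HONEYPOT_BANNERS = {"SSH-2.0-ExampleHoneypot_1.0"}
HONEYPOT_HTTP_BODIES = {"<html>example default page</html>"}
HONEYPOT_CERT_HASHES = {"00" * 32}


def _match(value: Optional[str], signatures: Set[str]) -> Optional[bool]:
    """Return None when the stage does not apply (nothing was observed)."""
    if value is None:
        return None
    return value.strip() in signatures


def portscan_check(obs: Observation) -> Optional[bool]:
    return bool(obs.open_ports & HONEYPOT_PORTS)


def banner_check(obs: Observation) -> Optional[bool]:
    return _match(obs.banner, HONEYPOT_BANNERS)


def static_http_check(obs: Observation) -> Optional[bool]:
    return _match(obs.http_body, HONEYPOT_HTTP_BODIES)


def tls_certificate_check(obs: Observation) -> Optional[bool]:
    return _match(obs.cert_sha256, HONEYPOT_CERT_HASHES)


Check = Callable[[Observation], Optional[bool]]
PIPELINE: List[Tuple[str, Check]] = [
    ("portscan", portscan_check),
    ("banner", banner_check),
    ("static_http", static_http_check),
    ("tls_certificate", tls_certificate_check),
]


def classify(obs: Observation) -> Tuple[bool, Dict[str, Optional[bool]]]:
    """Label a host as a honeypot only if no applicable stage fails."""
    results: Dict[str, Optional[bool]] = {}
    for name, check in PIPELINE:
        outcome = check(obs)
        results[name] = outcome
        if outcome is False:
            # Assumed early exit: one failed stage rules the host out.
            return False, results
    # At least one stage must have positively matched.
    return any(r is True for r in results.values()), results
```

A host that matches the placeholder port and banner, with no HTTP or TLS service observed, would be labeled a honeypot; a host whose banner differs is rejected at the banner stage and the later stages are never run.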

Figure 1: Multistage Framework for Honeypot Fingerprinting

Metascan-based Pipeline

The metascan-based pipeline leverages existing large-scale internet scans from platforms like Shodan and Censys. This strategy involves passive checks such as:

Keyword Search: Using known honeypot identifiers within Shodan/Censys datasets.
ISP and AS Check: Identifying systems linked to known honeypot-friendly ISPs or research organizations.
Cloud Hosting Assessment: Determining whether an instance is hosted in the cloud, which is particularly telling for ICS honeypots because real ICS devices are unlikely to be deployed in the cloud.
FQDN Check: Establishing whether a domain name is associated with the instance; honeypots rarely have one because of potential security and PR risks.
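The passive checks listed above can be sketched as functions over a single scan record. This is a hedged illustration, not the paper's code: the record fields (`ip`, `asn`, `banner`, `product`, `hostnames`) are a hypothetical layout and do not reproduce the Shodan or Censys schema, and the keyword, AS number, and network range are placeholders taken from ranges reserved for documentation. How the indicators should be weighed is not specified in this summary, so the sketch only reports them.

```python
import ipaddress
from typing import Any, Dict

# Placeholder values for illustration only.
HONEYPOT_KEYWORDS = ("examplepot",)
RESEARCH_ASNS = {64496}  # AS number reserved for documentation (RFC 5398)
CLOUD_NETWORKS = [ipaddress.ip_network("203.0.113.0/24")]  # TEST-NET-3 (RFC 5737)


def keyword_hit(record: Dict[str, Any]) -> bool:
    """Keyword search over the text fields of the record."""
    text = " ".join(str(record.get(k, "")) for k in ("banner", "product")).lower()
    return any(keyword in text for keyword in HONEYPOT_KEYWORDS)


def research_as(record: Dict[str, Any]) -> bool:
    """ISP and AS check against a list of known research networks."""
    return record.get("asn") in RESEARCH_ASNS


def cloud_hosted(record: Dict[str, Any]) -> bool:
    """Cloud hosting assessment based on known provider address ranges."""
    address = ipaddress.ip_address(record["ip"])
    return any(address in network for network in CLOUD_NETWORKS)


def lacks_fqdn(record: Dict[str, Any]) -> bool:
    """FQDN check: True when no domain name is associated with the host."""
    return not record.get("hostnames")


def metascan_indicators(record: Dict[str, Any]) -> Dict[str, bool]:
    """Collect all passive indicators without deciding on a final label."""
    return {
        "keyword": keyword_hit(record),
        "research_as": research_as(record),
        "cloud_hosted": cloud_hosted(record),
        "no_fqdn": lacks_fqdn(record),
    }
```

For example, `metascan_indicators({"ip": "203.0.113.7", "asn": 64496, "banner": "Welcome to ExamplePot", "hostnames": []})` sets every indicator, while a record with an unrelated banner, another AS number, an address outside the listed range, and a hostname sets none.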

Evaluation and Results

Extensive internet-wide scanning over six months identified 21,855 honeypot instances across several common honeypot implementations. The results indicate that metascan approaches, while useful, are less comprehensive than active probing.

Figure 2: Honeypots detected per scan

Figure 3: Honeypots detected by type and technique

Comparison to State of the Art

The framework achieved a broader detection range and fewer false positives than previous studies. Compared with single-method approaches, combining multiple checks noticeably improved the accuracy and reliability of honeypot identification.

Figure 4: Comparison to previous measurements in related work

Discussion

The results underscore how poorly current honeypots defend against fingerprinting, as most are run with default configurations. Moreover, a comparison with Shodan's Honeyscore showed that Honeyscore has limited effectiveness in detecting diverse honeypots, which points to the need for improved defenses against fingerprinting.

Figure 5: Comparison with Shodan's Honeyscore

Conclusion

The proposed multistage framework offers a robust approach to honeypot fingerprinting. Future work is encouraged to focus on dynamic, self-aware honeypots that adapt to attacks by leveraging Moving Target Defense (MTD) techniques. Better deployment and configuration practices are essential if honeypots are to remain a viable cybersecurity tool.