TapTree: Process-Tree Based Host Behavior Modeling and Threat Detection Framework via Sequential Pattern Mining

Published 10 Dec 2023 in cs.CR | (2312.07575v1)

Abstract: Audit logs containing system level events are frequently used for behavior modeling as they can provide detailed insight into cyber-threat occurrences. However, mapping low-level system events in audit logs to highlevel behaviors has been a major challenge in identifying host contextual behavior for the purpose of detecting potential cyber threats. Relying on domain expert knowledge may limit its practical implementation. This paper presents TapTree, an automated process-tree based technique to extract host behavior by compiling system events' semantic information. After extracting behaviors as system generated process trees, TapTree integrates event semantics as a representation of behaviors. To further reduce pattern matching workloads for the analyst, TapTree aggregates semantically equivalent patterns and optimizes representative behaviors. In our evaluation against a recent benchmark audit log dataset (DARPA OpTC), TapTree employs tree pattern queries and sequential pattern mining techniques to deduce the semantics of connected system events, achieving high accuracy for behavior abstraction and then Advanced Persistent Threat (APT) attack detection. Moreover, we illustrate how to update the baseline model gradually online, allowing it to adapt to new log patterns over time.