Semi-supervised learning via DQN for log anomaly detection (2401.03151v2)
Abstract: Log anomaly detection is a critical component in modern software system security and maintenance, serving as a crucial support and basis for system monitoring, operation, and troubleshooting. It aids operations personnel in timely identification and resolution of issues. However, current methods in log anomaly detection still face challenges such as underutilization of unlabeled data, imbalance between normal and anomaly class data, and high rates of false positives and false negatives, leading to insufficient effectiveness in anomaly recognition. In this study, we propose a semi-supervised log anomaly detection method named DQNLog, which integrates deep reinforcement learning to enhance anomaly detection performance by leveraging a small amount of labeled data and large-scale unlabeled data. To address issues of imbalanced data and insufficient labeling, we design a state transition function biased towards anomalies based on cosine similarity, aiming to capture semantic-similar anomalies rather than favoring the majority class. To enhance the model's capability in learning anomalies, we devise a joint reward function that encourages the model to utilize labeled anomalies and explore unlabeled anomalies, thereby reducing false positives and false negatives. Additionally, to prevent the model from deviating from normal trajectories due to misestimation, we introduce a regularization term in the loss function to ensure the model retains prior knowledge during updates. We evaluate DQNLog on three widely used datasets, demonstrating its ability to effectively utilize large-scale unlabeled data and achieve promising results across all experimental datasets.
- Deep reinforcement learning: A brief survey. IEEE Signal Processing Magazine 34, 6 (2017), 26–38.
- Richard Bellman. 1952. On the theory of dynamic programming. Proceedings of the national Academy of Sciences 38, 8 (1952), 716–719.
- Fingerprinting the datacenter: automated classification of performance crises. In Proceedings of the 5th European conference on Computer systems. 111–124.
- LOF: identifying density-based local outliers. In Proceedings of the 2000 ACM SIGMOD international conference on Management of data. 93–104.
- Failure diagnosis using decision trees. In International Conference on Autonomic Computing, 2004. Proceedings. IEEE, 36–43.
- Min Du and Feifei Li. 2016. Spell: Streaming parsing of system event logs. In 2016 IEEE 16th International Conference on Data Mining (ICDM). IEEE, 859–864.
- Deeplog: Anomaly detection and diagnosis from system logs through deep learning. In Proceedings of the 2017 ACM SIGSAC conference on computer and communications security. 1285–1298.
- Deep reinforcement learning for data-efficient weakly supervised business process anomaly detection. Journal of Big Data 10, 1 (2023), 33.
- Execution anomaly detection in distributed systems through unstructured log analysis. In 2009 ninth IEEE international conference on data mining. IEEE, 149–158.
- An evaluation study on log parsing and its use in log mining. In 2016 46th annual IEEE/IFIP international conference on dependable systems and networks (DSN). IEEE, 654–661.
- Drain: An online log parsing approach with fixed depth tree. In 2017 IEEE international conference on web services (ICWS). IEEE, 33–40.
- A survey on automated log analysis for reliability engineering. ACM computing surveys (CSUR) 54, 6 (2021), 1–37.
- Experience report: System log analysis for anomaly detection. In 2016 IEEE 27th international symposium on software reliability engineering (ISSRE). IEEE, 207–218.
- Loghub: A large collection of system log datasets towards automated log analytics. arXiv preprint arXiv:2008.06448 (2020).
- Towards experienced anomaly detector through reinforcement learning. In Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 32.
- Fasttext. zip: Compressing text classification models. arXiv preprint arXiv:1612.03651 (2016).
- Van-Hoang Le and Hongyu Zhang. 2022. Log-based anomaly detection with deep learning: How far are we?. In Proceedings of the 44th international conference on software engineering. 1356–1367.
- Failure prediction in ibm bluegene/l event logs. In Seventh IEEE International Conference on Data Mining (ICDM 2007). IEEE, 583–588.
- Long-Ji Lin. 1992. Self-improving reactive agents based on reinforcement learning, planning and teaching. Machine learning 8 (1992), 293–321.
- Log clustering based problem identification for online service systems. In Proceedings of the 38th International Conference on Software Engineering Companion. 102–111.
- Isolation forest. In 2008 eighth ieee international conference on data mining. IEEE, 413–422.
- Mining invariants from console logs for system problem detection. In 2010 USENIX Annual Technical Conference (USENIX ATC 10).
- Clustering event logs using iterative partitioning. In Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining. 1255–1264.
- Intelligent video anomaly detection and classification using faster RCNN with deep reinforcement learning model. Image and Vision Computing 112 (2021), 104229.
- Loganomaly: Unsupervised detection of sequential and quantitative anomalies in unstructured logs.. In IJCAI, Vol. 19. 4739–4745.
- Human-level control through deep reinforcement learning. nature 518, 7540 (2015), 529–533.
- Adam Oliner and Jon Stearley. 2007. What supercomputers say: A study of five system logs. In 37th annual IEEE/IFIP international conference on dependable systems and networks (DSN’07). IEEE, 575–584.
- Toward deep supervised anomaly detection: Reinforcement learning from partially labeled anomaly data. In Proceedings of the 27th ACM SIGKDD conference on knowledge discovery & data mining. 1298–1308.
- Risto Vaarandi. 2003. A data clustering algorithm for mining patterns from event logs. In Proceedings of the 3rd IEEE Workshop on IP Operations & Management (IPOM 2003)(IEEE Cat. No. 03EX764). Ieee, 119–126.
- Risto Vaarandi and Mauno Pihelgas. 2015. Logcluster-a data clustering and pattern mining algorithm for event logs. In 2015 11th International conference on network and service management (CNSM). IEEE, 1–7.
- Dueling network architectures for deep reinforcement learning. In International conference on machine learning. PMLR, 1995–2003.
- Detecting large-scale system problems by mining console logs. In Proceedings of the ACM SIGOPS 22nd symposium on Operating systems principles. 117–132.
- Semi-supervised log-based anomaly detection via probabilistic label estimation. In 2021 IEEE/ACM 43rd International Conference on Software Engineering (ICSE). IEEE, 1448–1460.
- Robust log-based anomaly detection on unstable log data. In Proceedings of the 2019 27th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering. 807–817.
- Deep mutual learning. In Proceedings of the IEEE conference on computer vision and pattern recognition. 4320–4328.
- Tools and benchmarks for automated log parsing. In 2019 IEEE/ACM 41st International Conference on Software Engineering: Software Engineering in Practice (ICSE-SEIP). IEEE, 121–130.
- Yingying He (4 papers)
- Xiaobing Pei (14 papers)