Evaluating Deep Learning-Based Log Anomaly Detection
The paper "Experience Report: Deep Learning-based System Log Analysis for Anomaly Detection" presents a comprehensive analysis and comparison of several deep learning (DL) models for log-based anomaly detection, addressing existing gaps between academic research and industrial practices in this area. Due to the unprecedented scale and complexity of modern software systems, traditional manual and machine learning approaches are no longer practical for anomaly detection, thus emphasizing the necessity of advanced DL techniques.
Overview
The authors evaluate five representative neural network architectures embodied in six state-of-the-art methods: four unsupervised (DeepLog, LogAnomaly, Logsy, Autoencoder) and two supervised (LogRobust, CNN). These methods are tested on two publicly available log datasets, from the Hadoop Distributed File System (HDFS) and the BlueGene/L supercomputer, together comprising nearly 16 million log messages and roughly 0.4 million anomaly instances. The evaluation centers on accuracy, robustness, and efficiency, yielding significant insights into the challenges and advantages of DL models in real-world anomaly detection.
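To make the unsupervised setup concrete, below is a minimal sketch (not the authors' released implementation) of the DeepLog-style idea: an LSTM learns to predict the next log-event ID from a window of preceding events, and an observed event that falls outside the model's top-k candidates is flagged as anomalous. All names and hyperparameters here are illustrative assumptions.

```python
# Minimal DeepLog-style next-event predictor (illustrative sketch).
import torch
import torch.nn as nn

class NextEventLSTM(nn.Module):
    def __init__(self, num_events: int, embed_dim: int = 32, hidden: int = 64):
        super().__init__()
        self.embed = nn.Embedding(num_events, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, num_events)

    def forward(self, windows: torch.Tensor) -> torch.Tensor:
        # windows: (batch, window_size) of event IDs -> logits over the next event
        out, _ = self.lstm(self.embed(windows))
        return self.head(out[:, -1, :])

def is_anomalous(model: NextEventLSTM, window: torch.Tensor,
                 actual_next: int, k: int = 9) -> bool:
    # Flag the observed next event if it is not among the model's top-k predictions.
    with torch.no_grad():
        logits = model(window.unsqueeze(0))
        topk = torch.topk(logits, k, dim=-1).indices.squeeze(0)
    return actual_next not in topk.tolist()
```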
Numerical Results
In terms of accuracy, supervised methods generally outperform unsupervised ones, owing to their ability to leverage labeled data during training. On the HDFS dataset, the traditional Decision Tree baseline achieves a remarkable F1 score of 0.998, showing that simple models remain competitive on data with stable characteristics. The DL methods, however, are more robust to unseen log events, which frequently arise as software systems evolve. Incorporating log semantics substantially improves both accuracy and robustness, especially in scenarios with unexpected log events.
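The robustness gain from log semantics can be illustrated with a small sketch: instead of treating each log template as an opaque ID, represent it as the average of its tokens' word vectors, so an unseen template that is textually similar to a known one lands nearby in feature space. The word_vectors lookup below (e.g., a pretrained embedding table) is an assumption for illustration, not the paper's exact pipeline.

```python
# Semantic representation of a log template via averaged word vectors (sketch).
import numpy as np

def template_embedding(template: str, word_vectors: dict, dim: int = 300) -> np.ndarray:
    # Tokenize the template; placeholders like "<*>" are simply skipped
    # because they have no entry in the word-vector table.
    tokens = template.lower().split()
    vecs = [word_vectors[t] for t in tokens if t in word_vectors]
    if not vecs:
        return np.zeros(dim)
    return np.mean(vecs, axis=0)

# "Receiving block <*> src <*>" and "Received block <*> of size <*>" map to
# nearby vectors, so a model trained on the former degrades gracefully when
# the latter appears after a software update.
```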
Efficiency is measured by training and testing times, with DL methods generally requiring more computational resources than traditional approaches. By contrast, traditional ML methods such as SVM and PCA are exceptionally fast, suggesting that, depending on the application context, simpler models may be preferable when computational resources are constrained.
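A minimal timing harness of the kind behind such efficiency comparisons might look as follows; train_fn and predict_fn are hypothetical placeholders for whichever method is under test, not functions from the paper's toolkit.

```python
# Measure training and testing wall-clock time for any detection method (sketch).
import time

def measure(train_fn, predict_fn, train_data, test_data):
    t0 = time.perf_counter()
    model = train_fn(train_data)
    train_time = time.perf_counter() - t0

    t0 = time.perf_counter()
    predictions = predict_fn(model, test_data)
    test_time = time.perf_counter() - t0
    return predictions, train_time, test_time
```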
Practical Implications
The implications of this research extend beyond theoretical understanding to practical deployment in industrial settings. The authors describe challenges they encountered when running DL-based anomaly detection in production at Huawei Cloud: managing the complexity of log data in large-scale systems, handling potentially low-quality and highly variable data, re-determining detection thresholds as the environment changes, and coping with concept drift as systems evolve.
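Threshold re-determination, one of the deployment challenges above, can be sketched as re-selecting the anomaly-score cutoff that maximizes F1 on a freshly labeled validation slice; everything below is illustrative rather than Huawei Cloud's actual procedure.

```python
# Re-pick the anomaly-score threshold on recent labeled data (sketch).
import numpy as np

def redetermine_threshold(scores: np.ndarray, labels: np.ndarray) -> float:
    # Sweep candidate cutoffs over the upper quantiles of the score distribution
    # and keep the one with the best F1 on this validation slice.
    best_threshold, best_f1 = 0.0, -1.0
    for threshold in np.quantile(scores, np.linspace(0.5, 0.999, 100)):
        preds = scores >= threshold
        tp = np.sum(preds & (labels == 1))
        fp = np.sum(preds & (labels == 0))
        fn = np.sum(~preds & (labels == 1))
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        if f1 > best_f1:
            best_threshold, best_f1 = threshold, f1
    return best_threshold
```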
Future Developments
The paper suggests several areas for future development, including the refinement of logging practices to enhance log data quality, which is critical for effective anomaly detection. The authors advocate closer collaboration among engineering teams to improve how log data is generated and consumed. Moreover, capabilities such as online learning, incorporation of human knowledge, and multi-source learning are identified as promising directions for addressing the current limitations of DL-based log anomaly detection.
Conclusion
This paper serves as an essential reference for both researchers and practitioners interested in implementing and improving DL techniques for log-based anomaly detection. By offering a detailed analysis of current methodologies and providing an open-source toolkit, the paper lays a foundation for further exploration and practical application of DL models in detecting anomalies within complex, large-scale software systems.