
LogFormer: A Pre-train and Tuning Pipeline for Log Anomaly Detection

Published 9 Jan 2024 in cs.LG, cs.AI, and cs.SE | (2401.04749v1)

Abstract: Log anomaly detection is a key component in the field of artificial intelligence for IT operations (AIOps). Considering log data of variant domains, retraining the whole network for unknown domains is inefficient in real industrial scenarios. However, previous deep models merely focused on extracting the semantics of log sequences in the same domain, leading to poor generalization on multi-domain logs. To alleviate this issue, we propose a unified Transformer-based framework for Log anomaly detection (LogFormer) to improve the generalization ability across different domains, where we establish a two-stage process including the pre-training and adapter-based tuning stage. Specifically, our model is first pre-trained on the source domain to obtain shared semantic knowledge of log data. Then, we transfer such knowledge to the target domain via shared parameters. Besides, the Log-Attention module is proposed to supplement the information ignored by log parsing. The proposed method is evaluated on three public datasets and one real-world dataset. Experimental results on multiple benchmarks demonstrate the effectiveness of our LogFormer with fewer trainable parameters and lower training costs.


Summary

  • The paper proposes a two-stage framework that pre-trains a Transformer-based model on source log data and adapts it using lightweight adapters for improved anomaly detection.
  • It introduces a novel Log-Attention module that preserves critical semantic information lost during log parsing, enhancing detection accuracy.
  • Experimental results on public and real-world datasets show that LogFormer achieves state-of-the-art performance with fewer trainable parameters and lower training costs.

LogFormer: Pre-training and Tuning Pipeline for Log Anomaly Detection

This paper introduces LogFormer, a two-stage framework for log anomaly detection designed to improve generalization across different domains. The core idea is to pre-train a Transformer-based model on a source domain to capture shared semantic knowledge of log data, and then adapt this knowledge to a target domain using adapter-based tuning. The authors also introduce a Log-Attention module to supplement information lost during log parsing. The proposed method is evaluated on three public datasets and one real-world dataset, demonstrating its effectiveness with fewer trainable parameters and lower training costs.

Addressing Log Anomaly Detection Challenges

Log anomaly detection is crucial for spotting irregular behavior in large-scale IT systems. Traditional methods struggle with the growing volume of log data and the semantic complexity of log messages. Existing deep learning methods often rely on log parsing to extract templates, which can discard the semantic information carried by the variable fields (a toy illustration of what parsing keeps and drops follows Figure 1 below). Furthermore, these methods typically focus on single-domain logs, limiting their ability to generalize to new domains or to keep up with the continuous iteration of log data. LogFormer addresses these challenges by preserving semantic knowledge shared between domains and by avoiding information loss through a novel Log-Attention module. The paper highlights the shared semantic space across different domains (Figure 1), which motivates the pre-training approach.

Figure 1: The same anomaly from multiple domains.
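To make the information-loss issue concrete, here is a toy illustration of what a parsed template keeps and what it drops. It assumes a Drain-style parser and an HDFS-like log line; neither the line nor the snippet comes from the paper.

```python
# Toy illustration of log parsing output (Drain-style), not from the paper.
raw_log = "Received block blk_3587508140051953248 of size 67108864 from /10.251.42.84"

# The parser keeps the constant "keywords" as a template ...
template = "Received block <*> of size <*> from <*>"
# ... and the variable fields become parameters, which many pipelines discard.
parameters = ["blk_3587508140051953248", "67108864", "/10.251.42.84"]

# LogFormer's Log-Attention module is motivated by keeping this `parameters`
# signal instead of throwing it away after parsing.
print(template)
print(parameters)
```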

LogFormer Architecture and Methodology

LogFormer's architecture consists of two main stages: pre-training and adapter-based tuning (Figure 2). The pre-training stage involves training a Transformer-based model with a Log-Attention module on a source domain to acquire common semantic knowledge from log sequences. The Log-Attention module is designed to incorporate information from parameters that are typically discarded during log parsing. In the adapter-based tuning stage, the pre-trained model is adapted to the target domain by adding lightweight adapters to the encoder layers. Only the parameters of the adapters are updated during this stage, while the parameters of the pre-trained model are frozen, enabling efficient knowledge transfer with minimal training costs.
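To make the two-stage process concrete, below is a minimal PyTorch sketch of the parameter-efficiency argument: every weight is trainable during source-domain pre-training, and only the adapters (and a task head) are updated during target-domain tuning. The class, attribute names, and dimensions (`LogEncoder`, `adapter`, `d_model=256`) are illustrative assumptions, not the authors' released code; for brevity the adapters here sit after each encoder layer, whereas the paper's parallel placement inside the sublayers is sketched later (after Figure 4).

```python
import torch
import torch.nn as nn

# Hypothetical backbone over log-sequence embeddings; names and shapes are assumptions.
class LogEncoder(nn.Module):
    def __init__(self, d_model=256, n_layers=4, n_heads=4, n_classes=2):
        super().__init__()
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)
        self.adapter = nn.ModuleList(  # one lightweight bottleneck adapter per layer
            nn.Sequential(nn.Linear(d_model, 32), nn.GELU(), nn.Linear(32, d_model))
            for _ in range(n_layers)
        )
        self.head = nn.Linear(d_model, n_classes)

    def forward(self, x):  # x: (batch, seq_len, d_model) embedded log sequences
        for layer, adapter in zip(self.encoder.layers, self.adapter):
            x = layer(x)
            x = x + adapter(x)              # residual adapter (simplified placement)
        return self.head(x.mean(dim=1))     # sequence-level anomaly logits

model = LogEncoder()

# Stage 1: pre-training on the source domain -- all parameters are trainable.
pretrain_opt = torch.optim.Adam(model.parameters(), lr=1e-4)

# Stage 2: adapter-based tuning on the target domain --
# freeze the shared backbone; only adapters and the task head are updated.
for name, p in model.named_parameters():
    p.requires_grad = name.startswith(("adapter", "head"))

tuning_opt = torch.optim.Adam(
    (p for p in model.parameters() if p.requires_grad), lr=1e-4
)
```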

Key Components: Log-Attention and Adapter-Based Tuning

The Log-Attention module is a key innovation in LogFormer, designed to counter the information loss caused by log parsing. After parsing, the module encodes the parameters of each log sequence with a linear layer and assigns a learnable scalar to each output, which serves as a bias term in self-attention (Figure 3). This allows the model to aggregate both keyword and parameter information, improving its ability to detect anomalies (a rough sketch of this bias mechanism follows the figure captions below).

Figure 2: Logs and Templates.


Figure 3: Log-Attention.
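Reading the description above literally, the bias could be realized roughly as follows: the parsed-out parameters of each log line are encoded by a linear layer, scaled by a learnable scalar, and added to the self-attention scores. This is a schematic sketch under assumed shapes and names (`LogAttention`, `d_param`), not the authors' implementation.

```python
import torch
import torch.nn as nn

class LogAttention(nn.Module):
    """Self-attention with an additive bias derived from the parameter tokens
    that log parsing strips out. A rough reading of the paper's description;
    module names and tensor shapes are assumptions."""
    def __init__(self, d_model=256, d_param=64, n_heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.param_proj = nn.Linear(d_param, 1)    # encode parameters of each log line
        self.scale = nn.Parameter(torch.zeros(1))  # learnable scalar on the bias

    def forward(self, x, param_feats):
        # x:           (batch, seq_len, d_model)  template/keyword embeddings
        # param_feats: (batch, seq_len, d_param)  embeddings of the parsed-out parameters
        bias = self.scale * self.param_proj(param_feats).squeeze(-1)   # (batch, seq_len)
        # Broadcast the per-position bias onto the attention score matrix:
        # every query attends to key j with an extra additive term bias[j].
        attn_mask = bias.unsqueeze(1).expand(-1, x.size(1), -1)        # (batch, L, L)
        attn_mask = attn_mask.repeat_interleave(self.attn.num_heads, dim=0)
        out, _ = self.attn(x, x, x, attn_mask=attn_mask)
        return out
```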

Adapter-based tuning is the other key ingredient of LogFormer, enabling efficient knowledge transfer from the source domain to the target domain. Adapters are inserted in parallel to the Log-Attention layer and the feedforward layer (Figure 4); a schematic sketch of this placement follows the figure below. The parallel design lets each adapter work on the same input the original encoder sublayer receives, while the pre-trained encoder itself stays intact. By updating only the adapter parameters during target-domain adaptation, LogFormer significantly reduces the number of trainable parameters and lowers training costs compared to fine-tuning the entire model.

Figure 4: Encoder with Adapters.
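The parallel placement could look roughly like the sketch below: each adapter is a small bottleneck that receives the same input as the attention or feedforward sublayer, and its output is added into the residual stream. Module names and dimensions are assumptions for illustration; `attn` here is a plain multi-head attention standing in for the Log-Attention module.

```python
import torch
import torch.nn as nn

class Adapter(nn.Module):
    """Bottleneck adapter: down-project, non-linearity, up-project."""
    def __init__(self, d_model=256, d_bottleneck=32):
        super().__init__()
        self.down = nn.Linear(d_model, d_bottleneck)
        self.up = nn.Linear(d_bottleneck, d_model)
        self.act = nn.GELU()

    def forward(self, x):
        return self.up(self.act(self.down(x)))

class AdapterEncoderLayer(nn.Module):
    """One encoder layer with adapters in parallel to the attention and
    feedforward sublayers, following the description of Figure 4. A schematic
    re-implementation under assumed shapes, not the authors' code."""
    def __init__(self, d_model=256, n_heads=4, d_ff=1024):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.ff = nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
        self.adapter_attn = Adapter(d_model)
        self.adapter_ff = Adapter(d_model)
        self.norm1 = nn.LayerNorm(d_model)
        self.norm2 = nn.LayerNorm(d_model)

    def forward(self, x):
        # Parallel placement: each adapter sees the same input as its sublayer,
        # and both outputs are summed into the residual stream.
        attn_out, _ = self.attn(x, x, x)
        x = self.norm1(x + attn_out + self.adapter_attn(x))
        x = self.norm2(x + self.ff(x) + self.adapter_ff(x))
        return x
```

With a bottleneck of 32 and a model width of 256, each adapter adds roughly 16k weights per sublayer, a small fraction of a full encoder layer, which is consistent with the paper's emphasis on few trainable parameters.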

Experimental Results and Analysis

The authors conducted extensive experiments on three public datasets (HDFS, BGL, and Thunderbird) and one real-world dataset (GAIA) to evaluate LogFormer. The results show that LogFormer achieves state-of-the-art performance on all three public benchmarks with fewer trainable parameters and lower training costs than existing methods. Ablation studies assess the impact of pre-training, adapter-based tuning, and the Log-Attention module; they confirm the effectiveness of each component and highlight the benefits of the two-stage training approach. In particular, fine-tuning converges faster than training from scratch, which shows that the knowledge learned from the source domain is valuable. LogFormer also achieves a slightly higher $F_1$ score (about 1% on average) than directly fine-tuning the pre-trained model on two of the datasets.

Impact of Pre-training and Low-Resource Performance

The paper also analyzes how pre-training affects convergence speed and performance. The results show that pre-training accelerates convergence and improves the model's initial performance, demonstrating the value of transferring knowledge from the source domain. LogFormer's behavior in low-resource settings, with fewer than 20k training examples, was assessed as well: it still delivers acceptable results, underscoring how parameter-efficient the approach is for log analysis. The training loss curves (Figure 5) illustrate the faster convergence achieved through pre-training.

Figure 5: Loss and $F_1$ score on the test set.

Practical Application and Generalization

LogFormer has been deployed at a cloud service company, demonstrating its practical utility in real-world scenarios. On the GAIA dataset, a real-world dataset collected from a distributed system, LogFormer achieves the best performance among the compared baselines, highlighting its ability to generalize to complex, multi-domain, and continuously evolving data. The model has been running stably for over 3,000 hours on this system, further demonstrating its robustness and reliability.

Conclusion

LogFormer presents a novel and effective approach to log anomaly detection, addressing the challenges of generalization and information loss in existing methods. The two-stage pre-training and adapter-based tuning pipeline, combined with the Log-Attention module, enables LogFormer to achieve state-of-the-art performance with fewer trainable parameters and lower training costs. The experimental results and practical application demonstrate the potential of LogFormer for real-world deployment in large-scale IT systems. Further research could explore different adapter architectures, pre-training objectives, and applications to other anomaly detection tasks.
