
Argos: Agentic Time-Series Anomaly Detection with Autonomous Rule Generation via Large Language Models (2501.14170v1)

Published 24 Jan 2025 in cs.LG, cs.DC, and cs.MA

Abstract: Observability in cloud infrastructure is critical for service providers, driving the widespread adoption of anomaly detection systems for monitoring metrics. However, existing systems often struggle to simultaneously achieve explainability, reproducibility, and autonomy, which are three indispensable properties for production use. We introduce Argos, an agentic system for detecting time-series anomalies in cloud infrastructure by leveraging LLMs. Argos proposes to use explainable and reproducible anomaly rules as intermediate representation and employs LLMs to autonomously generate such rules. The system will efficiently train error-free and accuracy-guaranteed anomaly rules through multiple collaborative agents and deploy the trained rules for low-cost online anomaly detection. Through evaluation results, we demonstrate that Argos outperforms state-of-the-art methods, increasing $F_1$ scores by up to $9.5\%$ and $28.3\%$ on public anomaly detection datasets and an internal dataset collected from Microsoft, respectively.

Summary

  • The paper introduces a novel agent-based anomaly detection system that employs LLMs for autonomous rule generation on time-series data.
  • It leverages a multi-stage pipeline with Detection, Repair, and Review Agents to ensure explainable, reproducible, and accurate rule generation.
  • Evaluation on public and internal datasets shows significant F1 score improvements, demonstrating enhanced performance over traditional methods.

Argos: Agentic Time-Series Anomaly Detection with Autonomous Rule Generation via LLMs

Introduction

The paper introduces Argos, an agentic system for detecting anomalies in time-series data from cloud infrastructure, which uses LLMs to generate detection rules autonomously. Argos targets explainability, reproducibility, and autonomy, three properties that existing approaches rarely achieve simultaneously.

System Design and Architecture

Argos leverages a multi-stage design, comprising data preprocessing, rule training, and deployment phases. The key components of Argos are:

  1. Data Preprocessor: Scales, indexes, and tokenizes input data for efficient processing within the context of time-series anomaly detection.
  2. Training Engine: Implements an agent-based pipeline with Detection, Repair, and Review Agents, ensuring the generation of syntactically correct and accurate anomaly detection rules.

     - Detection Agent: Proposes rules in Python based on input data.
     - Repair Agent: Corrects syntax errors in proposed rules.
     - Review Agent: Evaluates and iterates rules to improve accuracy.

  3. Deployment Components: Include an Anomaly Detector and an Aggregator, which combine outputs from base detectors and LLM-generated rules to ensure accuracy and resource efficiency.

    Figure 1: The overall design of Argos.

Autonomous Rule Generation

Argos distinguishes itself through autonomous rule generation via LLMs. The Detection Agent generates executable Python code for anomaly detection rules, bridging the gap between domain-specific expertise and machine-generated logic. Existing LLM techniques are integrated to ensure rules that are both explainable and reproducible, while maintaining the adaptability of the system to varying anomaly patterns.
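To make the idea of an executable, explainable rule concrete, here is a minimal sketch of what a generated rule could look like. It is an illustration only: the function name `spike_rule`, its parameters, and the median-based logic are assumptions made for this example and are not taken from the paper.

```python
from statistics import median


def spike_rule(values, window=5, k=3.0, min_scale=1e-6):
    """Hypothetical rule: flag points far from the median of the trailing window.

    All names and thresholds here are illustrative assumptions, not rules
    produced by Argos.
    """
    flags = [False] * len(values)
    for i in range(window, len(values)):
        history = list(values[i - window:i])
        center = median(history)
        # Median absolute deviation as a robust estimate of spread.
        mad = median([abs(v - center) for v in history])
        scale = max(mad, min_scale)  # avoid a zero scale on flat history
        flags[i] = abs(values[i] - center) > k * scale
    return flags


series = [1, 1, 1, 1, 1, 1, 10, 1, 1, 1]
flags = spike_rule(series)
```

Because the rule is plain, deterministic code, the same input always yields the same flags, and an operator can read the threshold logic directly. That is the sense in which such rules are reproducible and explainable.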

Correctness and Accuracy

Argos employs iterative feedback loops between the Repair and Review Agents to improve rule accuracy and correctness. This approach is inspired by backpropagation, ensuring the continuous improvement of anomaly detection rules through systematic error correction and performance evaluation.
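The propose, repair, and review cycle can be sketched as a small loop. This is a simplified illustration under stated assumptions, not the paper's Training Engine: the callables `propose`, `repair`, and `review` are hypothetical stand-ins for LLM-backed agents, and the convention that a rule defines a function named `detect` is invented for this example.

```python
def f1_score(pred, truth):
    """Plain point-wise F1 over boolean predictions and labels."""
    tp = sum(p and t for p, t in zip(pred, truth))
    fp = sum(p and not t for p, t in zip(pred, truth))
    fn = sum(t and not p for p, t in zip(pred, truth))
    if tp == 0:
        return 0.0
    precision, recall = tp / (tp + fp), tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def load_rule(source):
    """Compile rule source and return its `detect` function (assumed name)."""
    namespace = {}
    # A real system would sandbox generated code; exec is used only for brevity.
    exec(compile(source, "<rule>", "exec"), namespace)
    return namespace["detect"]


def train_rule(propose, repair, review, values, labels, max_iters=3, max_repairs=2):
    best_source, best_f1, feedback = None, -1.0, ""
    for _ in range(max_iters):
        source = propose(feedback)  # Detection Agent stand-in
        pred = None
        for _ in range(max_repairs + 1):
            try:
                pred = load_rule(source)(values)
                break
            except Exception as err:
                source = repair(source, repr(err))  # Repair Agent stand-in
        if pred is None:
            feedback = "rule could not be repaired"
            continue
        score = f1_score(pred, labels)
        if score > best_f1:
            best_source, best_f1 = source, score
        feedback = review(source, score)  # Review Agent stand-in
    return best_source, best_f1


# Stub agents: the proposal deliberately contains a syntax error.
def propose(feedback):
    return "def detect(values):\n    return [v > 5 for v in values"


def repair(source, error):
    return source + "]"


def review(source, score):
    return f"F1 was {score:.2f}"


best_source, best_f1 = train_rule(
    propose, repair, review, [1, 2, 9, 1], [False, False, True, False]
)
```

In this toy run the first proposal fails to compile, the repair stub closes the bracket, and the repaired rule is then scored against the labels. Keeping the best-scoring source across iterations means a later, worse proposal cannot overwrite an earlier, better one.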

Model Fusion for Accuracy Guarantee

The model fusion strategy in Argos combines LLM-generated rules with existing, well-tuned anomaly detectors. The aim is that deploying autonomously generated rules does not reduce accuracy relative to the existing detectors and, where the rules help, improves on it.
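One simple way to obtain a "no worse than the baseline" property on validation data is to treat the base detector alone as one of the candidate fusions and pick whichever candidate scores best. The sketch below illustrates that idea only; the candidate set and the name `select_fusion` are assumptions for this example, and the paper's actual aggregation logic may differ.

```python
def f1_score(pred, truth):
    """Plain point-wise F1 over boolean predictions and labels."""
    tp = sum(p and t for p, t in zip(pred, truth))
    fp = sum(p and not t for p, t in zip(pred, truth))
    fn = sum(t and not p for p, t in zip(pred, truth))
    if tp == 0:
        return 0.0
    precision, recall = tp / (tp + fp), tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def select_fusion(base_pred, rule_pred, labels):
    """Pick the best-scoring fusion on validation data (hypothetical helper)."""
    candidates = {
        "base_only": list(base_pred),
        "base_or_rule": [b or r for b, r in zip(base_pred, rule_pred)],
        "base_and_rule": [b and r for b, r in zip(base_pred, rule_pred)],
    }
    scores = {name: f1_score(pred, labels) for name, pred in candidates.items()}
    # Ties go to the base detector, so a rule is adopted only if it helps.
    best = max(scores, key=lambda name: (scores[name], name == "base_only"))
    return best, scores[best]


labels = [False, True, False, True]
base = [False, True, False, False]

helpful = select_fusion(base, [False, False, False, True], labels)
harmful = select_fusion(base, [True, False, True, False], labels)
```

Because `base_only` is always among the candidates, the selected fusion can never score below the base detector on the validation set. This says nothing about unseen data, which is why the guarantee here should be read as a validation-time property of the sketch.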

Evaluation

Argos was evaluated on public datasets such as KPI and Yahoo, as well as an internal Microsoft dataset. The results show an improvement in $F_1$ scores over state-of-the-art methods of up to 9.5% on the public datasets and up to 28.3% on the internal dataset. These evaluations underscore Argos' effectiveness in addressing the challenges of time-series anomaly detection.

Figure 2: Comparison of the correctness rate and average test F1 score of the Training Engine with only the Detection Agent versus full Training Engine with Repair and Review Agents.

Conclusion

Argos represents a substantial advancement in time-series anomaly detection, effectively addressing the triad of explainability, reproducibility, and autonomy. Through the autonomous generation of detection rules via LLMs, Argos provides an efficient, adaptable, and robust solution for anomaly detection in cloud infrastructures. The system's design ensures higher accuracy and efficiency, making it a valuable tool for enhancing the reliability of cloud services. Future directions may focus on expanding Argos’ applications to other domains and integrating more sophisticated model fusion techniques to further improve its performance.
