Time-Series Anomaly Detection Service at Microsoft

Published 10 Jun 2019 in cs.LG and stat.ML (arXiv:1906.03821v1)

Abstract: Large companies need to monitor various metrics (for example, Page Views and Revenue) of their applications and services in real time. At Microsoft, we develop a time-series anomaly detection service which helps customers to monitor the time-series continuously and alert for potential incidents on time. In this paper, we introduce the pipeline and algorithm of our anomaly detection service, which is designed to be accurate, efficient and general. The pipeline consists of three major modules, including data ingestion, experimentation platform and online compute. To tackle the problem of time-series anomaly detection, we propose a novel algorithm based on Spectral Residual (SR) and Convolutional Neural Network (CNN). Our work is the first attempt to borrow the SR model from visual saliency detection domain to time-series anomaly detection. Moreover, we innovatively combine SR and CNN together to improve the performance of SR model. Our approach achieves superior experimental results compared with state-of-the-art baselines on both public datasets and Microsoft production data.

Summary

The paper introduces a novel SR-CNN hybrid model that transforms time-series data to reveal anomalies without labeled training data.
It reports a 36.1% increase in F1-score over the best unsupervised baseline on the KPI dataset.
The approach is practical for monitoring millions of metrics in real time across Microsoft services, and the authors name ensemble learning as a direction for future work.

Time-Series Anomaly Detection Service at Microsoft

The paper authored by Ren et al. discusses the development and implementation of a time-series anomaly detection service at Microsoft. Focused on enabling the real-time monitoring of various metrics for large-scale applications, this service is crucial for identifying potential issues that could impact system operations and user experience. The research presents a novel anomaly detection algorithm combining Spectral Residual (SR), traditionally used in visual saliency detection, with Convolutional Neural Network (CNN) architectures. This hybrid approach advances the state-of-the-art in unsupervised anomaly detection, particularly in scenarios where labeled training data is absent.

Methodology

The core methodology combines the SR technique, borrowed from visual saliency detection, with a CNN that refines the detection decision. SR works on the frequency-domain representation of the series: it removes the smooth part of the log amplitude spectrum and transforms the residual back, producing a saliency map in which anomalous points stand out. SR alone flags anomalies by thresholding this saliency map. The SR-CNN variant replaces the fixed threshold with a CNN applied to the saliency map and trained only on synthetically injected anomalies, which gives a more flexible decision boundary. Because no manually labeled anomalies are required, the setup remains unsupervised, which distinguishes it from fully supervised models that depend on labeled datasets.
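The SR step can be illustrated with a short NumPy sketch. This is a minimal illustration under stated assumptions, not the production implementation: the function names, the window sizes, the threshold value, and the centered (rather than trailing) local average are all illustrative choices, and the CNN stage is omitted entirely.

```python
import numpy as np


def spectral_residual_saliency(x, avg_window=3):
    """Return an SR-style saliency map for a 1-D series (illustrative sketch)."""
    x = np.asarray(x, dtype=float)
    spectrum = np.fft.fft(x)
    amplitude = np.abs(spectrum)
    phase = np.angle(spectrum)

    # Log amplitude spectrum; the small constant avoids log(0).
    log_amp = np.log(amplitude + 1e-8)

    # Smooth the log spectrum with a moving average and keep the residual.
    kernel = np.ones(avg_window) / avg_window
    residual = log_amp - np.convolve(log_amp, kernel, mode="same")

    # Recombine the residual with the original phase and go back to the
    # time domain; the magnitude is the saliency map.
    return np.abs(np.fft.ifft(np.exp(residual + 1j * phase)))


def detect_anomalies(x, avg_window=3, score_window=21, threshold=3.0):
    """Flag points whose saliency exceeds its local average by a relative margin.

    Assumption: a centered moving average is used here for simplicity; an
    online detector would only have access to preceding points.
    """
    saliency = spectral_residual_saliency(x, avg_window)
    kernel = np.ones(score_window) / score_window
    local_mean = np.convolve(saliency, kernel, mode="same")
    score = (saliency - local_mean) / (local_mean + 1e-8)
    return score > threshold


if __name__ == "__main__":
    # Toy data: a clean sine wave with one injected spike (made-up example).
    t = np.arange(500)
    series = np.sin(2 * np.pi * t / 50.0)
    series[250] += 5.0
    print(np.flatnonzero(detect_anomalies(series)))
```

In SR-CNN, the fixed `threshold` comparison in `detect_anomalies` is the part that a trained CNN would replace, with the saliency map as its input.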

Experimental Results

The authors rigorously evaluate their approach on multiple datasets, including both external open-source collections (KPI and Yahoo) and internal Microsoft production data. The results reveal that the SR-CNN model significantly outperforms existing unsupervised methods, with improvements noticeable across precision, recall, and $F_1$-scores. For example, on the KPI dataset, the new method shows a 36.1% increase in $F_1$-score over the best baseline unsupervised approach. The computational efficiency is also noteworthy, as the SR approach demonstrates rapid execution times, critical for online anomaly detection in large-scale scenarios.
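For readers less familiar with the metrics, the sketch below shows how precision, recall, and F1 are computed from binary point labels, and how a gain between two F1 values can be expressed. It is a plain point-wise computation on made-up labels; the paper's own evaluation protocol may differ, and this summary does not say whether the 36.1% figure is relative or absolute, so the `relative_improvement` helper (a hypothetical name) simply assumes a relative gain.

```python
def precision_recall_f1(y_true, y_pred):
    """Point-wise precision, recall, and F1 for binary anomaly labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t and p)        # anomalies correctly flagged
    fp = sum(1 for t, p in pairs if not t and p)    # false alarms
    fn = sum(1 for t, p in pairs if t and not p)    # missed anomalies

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def relative_improvement(new_score, baseline_score):
    """Relative gain of new_score over baseline_score (assumed interpretation)."""
    return (new_score - baseline_score) / baseline_score


if __name__ == "__main__":
    # Made-up labels for illustration only; not data from the paper.
    truth = [1, 0, 1, 1, 0, 0]
    predicted = [1, 1, 1, 0, 0, 0]
    print(precision_recall_f1(truth, predicted))
```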

Practical and Theoretical Implications

Practically, the proposed service can handle millions of time-series, an essential feature given the volume of data managed by services like Bing, Office 365, and Azure. Theoretically, the application of visual saliency algorithms to time-series data opens new pathways for cross-domain learning applications. This approach may inspire similar applications in fields where traditional and computer vision techniques might be synergistically combined.

Future Directions

The paper suggests ensemble learning as a future direction. By combining multiple models, an anomaly detection system can become more robust and handle a wider range of patterns and variations. Additionally, the deployment of this technology as part of Microsoft's Cognitive Services on Azure indicates broader applicability and commercial potential. Such work could lead to automated systems that handle unforeseen anomalies in real time, improving operational efficiency across industries.

In summary, this research contributes significantly to the field of unsupervised anomaly detection in time-series data by demonstrating a practical system with theoretical underpinnings enriched by cross-domain insights. The approach not only serves current Microsoft products but also bears potential for broader adoption in real-time data monitoring and anomaly detection contexts.
