MIDAS: Microcluster Anomaly Detection

Updated 22 May 2026

MIDAS is a microcluster-based anomaly detection method that uses count–min sketches to monitor massive, evolving graph edge streams.
It applies an online two-bin chi-squared test, optionally with temporal decay, to compute anomaly scores with a provable upper bound on the false positive probability.
Empirical results on datasets such as DARPA and TwitterWorldCup show speedups of up to 644× over SedanSpot and improved detection accuracy (AUC up to 0.95).

MIDAS

MIDAS is an acronym employed across diverse disciplines to name distinct tools, models, and experiments in data science, machine learning, robotics, astronomy, computer vision, and sensing. Below, major MIDAS variants are organized as independent research contributions, each defined by its core technical achievements, methodologies, and empirical impacts, as found in primary arXiv sources.

1. Microcluster-Based Anomaly Detection in Edge Streams

MIDAS refers to the “Microcluster-Based Detector of Anomalies in Edge Streams”—an online algorithm for detecting anomalous groups of edges (microclusters) in massive, temporally evolving graphs. Unlike methods that solely identify individually rare edges, MIDAS focuses on sudden bursts of similar edge activity, as in lockstep or denial-of-service attacks (Bhatia et al., 2019).

Problem Definition and Microcluster Anomalies

A microcluster is defined as a “suddenly arriving group of suspiciously similar edges,” for example, a burst of the same or related $(u,v)$ pairs within a short time tick. The detection objective is not the surprise of a single edge, but the collective deviation from historical behavior: for each edge $(u,v)$ (or for the edges around node $u$ or $v$), MIDAS tests whether the count at the current tick greatly exceeds the count expected under the historical per-tick rate.

Algorithmic Structure

Data Structures: MIDAS maintains two Count–Min Sketches (CMS): one for the total counts $s_{uv}$ over all ticks, and one for the current-tick counts $a_{uv}$ at time $t$. Each CMS update or query takes $O(1)$ time per edge, and total memory is constant (independent of the stream length) because the width $b$ and depth $w$ are fixed in advance.
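Online Anomaly Score: The system executes a streaming, two-bin chi-squared test. At time $t$ for edge $(u,v)$, the expected count is $s_{uv}/t$. The score is

$$\text{score}(u,v,t) = \left(\hat{a}_{uv} - \frac{\hat{s}_{uv}}{t}\right)^2 \frac{t^2}{\hat{s}_{uv}\,(t-1)}$$

where $\hat{a}_{uv}$, $\hat{s}_{uv}$ are CMS estimates.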
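The two sketches and the score can be illustrated with a minimal Python sketch. It is not the authors' implementation: the class names `CountMinSketch` and `Midas`, the salted use of Python's built-in `hash` in place of proper pairwise-independent hash functions, and the default sizes are all assumptions made for illustration. Ticks are assumed to start at 1 and to arrive in non-decreasing order.

```python
import random


class CountMinSketch:
    """Fixed-size approximate counter: `depth` hash rows of `width` buckets each."""

    def __init__(self, depth=2, width=1024, seed=0):
        rng = random.Random(seed)
        self.width = width
        # One random salt per row stands in for an independent hash function
        # (an illustrative assumption, not a vetted hash family).
        self.salts = [rng.getrandbits(64) for _ in range(depth)]
        self.rows = [[0.0] * width for _ in range(depth)]

    def _buckets(self, key):
        return [hash((salt, key)) % self.width for salt in self.salts]

    def add(self, key, amount=1.0):
        for row, b in zip(self.rows, self._buckets(key)):
            row[b] += amount

    def estimate(self, key):
        # Collisions can only inflate a bucket, so the minimum over rows
        # is the tightest available estimate.
        return min(row[b] for row, b in zip(self.rows, self._buckets(key)))

    def clear(self):
        for row in self.rows:
            for i in range(self.width):
                row[i] = 0.0


class Midas:
    """Hypothetical wrapper: scores each edge as it arrives."""

    def __init__(self, depth=2, width=1024):
        self.total = CountMinSketch(depth, width, seed=1)    # estimates s_uv
        self.current = CountMinSketch(depth, width, seed=2)  # estimates a_uv
        self.tick = 1

    def score(self, u, v, t):
        if t > self.tick:
            # A new tick has started: reset the current-tick counts.
            self.current.clear()
            self.tick = t
        key = (u, v)
        self.current.add(key)
        self.total.add(key)
        a_hat = self.current.estimate(key)
        s_hat = self.total.estimate(key)
        if t <= 1:
            return 0.0  # no history to compare against yet
        return (a_hat - s_hat / t) ** 2 * t * t / (s_hat * (t - 1))
```

Feeding an edge that appears once per tick yields a score of zero at every tick, because the current count equals the historical mean; a sudden burst of the same edge within one tick drives the score up with each repeated arrival.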

Extension – MIDAS-R: Adds temporal decay ($\alpha$-discounting between ticks) and computes per-node CMS and scores for spatial relations, assigning to each edge the maximum anomaly score over the edge and its endpoint nodes.
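The decay and the maximum over edge and node scores can be sketched as follows. To keep the example short, exact dictionaries stand in for the count-min sketches, which is a simplification and not how a constant-memory implementation would store counts. The names `chi2_score` and `MidasRSketch`, the default `alpha=0.5`, and the choice to apply the decay once per observed tick change are assumptions for illustration.

```python
from collections import defaultdict


def chi2_score(a, s, t):
    """Two-bin chi-squared score for current count a, total count s, tick t."""
    if t <= 1 or s == 0:
        return 0.0
    return (a - s / t) ** 2 * t * t / (s * (t - 1))


class MidasRSketch:
    """Hypothetical MIDAS-R style scorer with exact counters."""

    KINDS = ("edge", "src", "dst")

    def __init__(self, alpha=0.5):
        self.alpha = alpha
        self.tick = 1
        # Exact dictionaries replace the count-min sketches (assumption for brevity).
        self.current = {kind: defaultdict(float) for kind in self.KINDS}
        self.total = {kind: defaultdict(float) for kind in self.KINDS}

    def score(self, u, v, t):
        if t > self.tick:
            # Temporal decay: scale the current counts by alpha instead of
            # resetting them, applied once per observed tick change.
            for counts in self.current.values():
                for key in counts:
                    counts[key] *= self.alpha
            self.tick = t
        scores = []
        for kind, key in (("edge", (u, v)), ("src", u), ("dst", v)):
            self.current[kind][key] += 1.0
            self.total[kind][key] += 1.0
            scores.append(
                chi2_score(self.current[kind][key], self.total[kind][key], t)
            )
        # The edge receives the largest of the edge, source, and destination scores.
        return max(scores)
```

With this structure, a node that suddenly contacts many distinct destinations in one tick receives a high score through its source-node counts, even though each individual edge is new and scores low on its own.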

Theoretical Guarantees

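MIDAS provides a provable upper bound $\epsilon$ on the probability of false positives. For bias-corrected count $\tilde{a}_{uv}$, the adjusted test statistic $\tilde{\chi}^2$ satisfies

$$P\left(\tilde{\chi}^2 > \chi^2_{1-\epsilon/2}(1)\right) < \epsilon$$

where $\chi^2_{1-\epsilon/2}(1)$ is the corresponding $1-\epsilon/2$ quantile of the chi-squared distribution with one degree of freedom.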
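To turn a target false positive probability into a concrete score threshold, one needs the chi-squared quantile. The sketch below assumes the relevant quantile is the $1-\epsilon/2$ quantile of a chi-squared distribution with one degree of freedom (the natural choice for a two-bin test, but stated here as an assumption). It uses only the standard library, relying on the fact that a chi-squared variable with one degree of freedom is the square of a standard normal. The function names are hypothetical.

```python
from statistics import NormalDist


def chi2_quantile_1df(q):
    """q-quantile of a chi-squared distribution with one degree of freedom.

    If Z is standard normal, Z**2 is chi-squared with one degree of freedom,
    so P(Z**2 <= x) = q exactly when sqrt(x) is the (1 + q) / 2 quantile of Z.
    """
    return NormalDist().inv_cdf((1.0 + q) / 2.0) ** 2


def midas_threshold(epsilon):
    """Score threshold for a target false positive bound epsilon (assumed form)."""
    return chi2_quantile_1df(1.0 - epsilon / 2.0)
```

A smaller `epsilon` gives a larger threshold, so fewer edges are flagged. The bound applies to the bias-corrected statistic, not to the raw score computed from uncorrected sketch estimates.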

Empirical Results

Evaluations on real network and event-stream datasets, including DARPA Intrusion, TwitterSecurity, and TwitterWorldCup, demonstrate the following:

Dataset	AUC (SedanSpot)	AUC (MIDAS)	Time (SedanSpot, s)	Time (MIDAS, s)	AUC Gain	Speedup
DARPA	0.64	0.91	84	0.13	+42%	644×
DARPA (MIDAS-R)	-	0.95	-	0.39	+48%	215×
TwitterWorldCup	-	-	27.58	0.06	-	460×
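The derived columns can be recomputed from the tabulated AUC and timing values. The sketch below assumes that "AUC Gain" is the relative improvement over the SedanSpot AUC and that "Speedup" is the ratio of running times, with the MIDAS-R row compared against the SedanSpot figures for DARPA. The helper names are hypothetical.

```python
def relative_gain_percent(new, baseline):
    """Relative improvement of `new` over `baseline`, in percent."""
    return 100.0 * (new - baseline) / baseline


def speedup(baseline_seconds, new_seconds):
    """How many times faster the new method is than the baseline."""
    return baseline_seconds / new_seconds


# Values copied from the table above.
SEDANSPOT_DARPA_AUC = 0.64
SEDANSPOT_DARPA_TIME = 84.0

darpa_midas_gain = relative_gain_percent(0.91, SEDANSPOT_DARPA_AUC)
darpa_midas_r_gain = relative_gain_percent(0.95, SEDANSPOT_DARPA_AUC)
darpa_midas_speedup = speedup(SEDANSPOT_DARPA_TIME, 0.13)
darpa_midas_r_speedup = speedup(SEDANSPOT_DARPA_TIME, 0.39)
worldcup_speedup = speedup(27.58, 0.06)
```

The recomputed gains and the MIDAS-R and TwitterWorldCup speedups round to the tabulated figures. The DARPA speedup for MIDAS comes out within about 1% of the tabulated 644×, a difference that presumably reflects rounding in the reported timings.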

Average precision improvements closely match AUC gains.

References

Bhatia et al. (2019). MIDAS: Microcluster-Based Detector of Anomalies in Edge Streams.
