CoDeGraph: Graph-based Consistent Anomaly Detection
- CoDeGraph is a graph-based methodology that detects recurring, structurally similar anomalies in zero-shot industrial settings by leveraging patch similarity metrics and novel graph analysis.
- It computes an endurance ratio from patch distance statistics, exploiting the neighbor-burnout phenomenon to filter consistent-anomaly matches and significantly improve anomaly classification and segmentation.
- By constructing an image-level graph and applying community detection via the Leiden algorithm, CoDeGraph robustly isolates anomaly clusters across diverse neural architectures and datasets.
Consistent-Anomaly Detection Graph (CoDeGraph) is a graph-based methodology for anomaly detection that systematically addresses the challenge posed by "consistent anomalies"—recurrent, structurally similar defects that appear across multiple samples—particularly in the zero-shot industrial anomaly detection setting. CoDeGraph introduces a novel graph-theoretic approach to anomaly filtering, moving beyond patch-level similarity analysis to a holistic, image-level community detection paradigm. By leveraging the “neighbor-burnout” phenomenon and formalizing this through metrics like the endurance ratio, CoDeGraph enables accurate and robust identification and removal of recurring anomaly artifacts that typical nearest-neighbor-based methods cannot resolve (Le-Gia et al., 12 Oct 2025).
1. Concept and Motivation
CoDeGraph was specifically designed for industrial quality control, where zero-shot anomaly detection (classification and segmentation) must operate without any labeled training data. In this domain, defects often recur with highly similar visual manifestations, resulting in “consistent anomalies.” These repeated patterns confound traditional methods that rely solely on patchwise, cross-image similarity. Existing state-of-the-art approaches—while effective for rare, stochastic anomalies—fail to detect images containing frequent, structurally similar defects, as such anomalies mimic normal behavior in the global patch similarity distribution.
CoDeGraph addresses this by reframing anomaly localization as a graph analysis problem: the relationships among image patches are embedded in a graph where dense connections reveal consistent anomalous clusters, which can then be algorithmically isolated and filtered.
2. Technical Methodology
The CoDeGraph algorithm operates in three principal stages:
- Identification of Patch Similarities: For each patch (obtained via a Vision Transformer (ViT) backbone), the algorithm computes the distance to its most similar patch in every other test image. These distances, sorted in ascending order, form the patch's ordered mutual similarity vector.
- Isolation of Consistent Anomalies via Endurance Ratio: A novel endurance ratio metric,
$$\mathrm{ER}(p) = \frac{d_{(k)}(p)}{d_{(m)}(p)},$$
is computed, where $d_{(k)}(p)$ is the distance from patch $p$ to its $k$-th nearest neighbor and $m \gg k$ is a reference index well beyond the immediate neighbor set associated with consistent matches. A low endurance ratio indicates an abrupt increase in distance following a short run of highly similar matches, which is characteristic of consistent anomalies (see the first sketch after this list).
- Construction and Community Detection on Image-Level Graph: An image-level graph is constructed where nodes are images, and an edge exists between images $I_i$ and $I_j$ if they share at least one suspicious link (as determined by low-endurance-ratio patches). The edge weight $w_{ij}$ counts the suspicious links. Community detection is then performed using the Leiden algorithm on the Constant Potts Model (CPM) objective:
$$\max_{\{c_i\}} \sum_{i<j} \left( w_{ij} - \gamma \right) \delta(c_i, c_j),$$
where $w_{ij}$ is the edge weight, $\gamma$ (commonly set to the 25th percentile of edge weights) controls resolution, and $\delta(c_i, c_j)$ indicates whether images $i$ and $j$ are assigned to the same community. After detecting consistent-anomaly communities, CoDeGraph does not eliminate all patches from flagged images but instead removes only those whose anomaly score is dominated by intra-community matches (as measured by the increase in anomaly score after those matches are excluded). Both computational stages are sketched in code below.
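As a concrete illustration of stages 1 and 2, the following sketch computes ordered mutual similarity vectors and endurance ratios with brute-force distances. The function name, the default indices k and m, and the use of plain Euclidean distance are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

def endurance_ratios(patches_per_image, k=2, m=30):
    """Sketch of stages 1-2: ordered mutual similarity vectors and
    ER = d_(k) / d_(m). Requires m <= len(patches_per_image) - 1.

    patches_per_image: list of (P_i, D) feature arrays, one per test image.
    """
    ratios = []
    for i, Pi in enumerate(patches_per_image):
        per_image_best = []
        for j, Pj in enumerate(patches_per_image):
            if i == j:
                continue
            # Distance from each patch in image i to its closest patch in image j.
            d = np.linalg.norm(Pi[:, None, :] - Pj[None, :, :], axis=-1)
            per_image_best.append(d.min(axis=1))
        # Ordered mutual similarity vector: one best-match distance per other image.
        sims = np.sort(np.stack(per_image_best, axis=1), axis=1)
        # Low ER = short run of very close matches, then a jump: suspicious patch.
        ratios.append(sims[:, k - 1] / sims[:, m - 1])
    return ratios
```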
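For stage 3, the Leiden algorithm with the CPM objective is available off the shelf through the python-igraph and leidenalg packages; a minimal sketch, assuming suspicious links arrive as image-index pairs (one pair per low-endurance-ratio match):

```python
import numpy as np
import igraph as ig
import leidenalg as la

def consistent_anomaly_communities(suspicious_links, n_images):
    """Build the image-level graph and run Leiden on the CPM objective.

    suspicious_links: iterable of (i, j) image-index pairs; repeated
    pairs accumulate into the edge weight w_ij.
    """
    weights = {}
    for i, j in suspicious_links:
        e = (min(i, j), max(i, j))
        weights[e] = weights.get(e, 0) + 1  # w_ij = number of suspicious links
    edges, w = list(weights.keys()), list(weights.values())
    g = ig.Graph(n=n_images, edges=edges)
    g.es["weight"] = w
    gamma = float(np.percentile(w, 25))     # resolution = 25th percentile of weights
    return la.find_partition(
        g, la.CPMVertexPartition, weights="weight", resolution_parameter=gamma
    )
```

Images grouped into the same non-trivial community are the candidates for consistent-anomaly filtering; the subsequent per-patch score adjustment is not shown here.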
3. The Neighbor-Burnout Phenomenon
A central empirical observation leveraged by CoDeGraph is the “neighbor-burnout” effect: for a normal patch, the distance to its $k$-th nearest neighbor patch in other images grows slowly and smoothly with $k$, with increments that decay in a power-law-like fashion on a log scale. In strong contrast, a patch in a consistent anomaly finds a limited run of very close matches (due to the defect's repeated appearance across the test set) until these are exhausted, after which the next match produces an abrupt, large jump in distance. This results in a “burnout” spike in the log growth rate
$$g_k = \log d_{(k+1)} - \log d_{(k)}.$$
Normal patches yield low $g_k$ across all $k$, while consistent anomalies display a pronounced spike at small $k$, which the endurance ratio captures by comparing near and distant neighbors.
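The statistic itself is one line of array code over a patch's ordered mutual similarity vector; a small hedged sketch (the function name is assumed):

```python
import numpy as np

def log_growth_rates(sorted_dists):
    """g_k = log d_(k+1) - log d_(k) for an ascending distance vector;
    a spike at small k is the burnout signature of a consistent anomaly."""
    return np.diff(np.log(np.asarray(sorted_dists, dtype=float)))

# Example: a short run of close matches, then a jump -> spike at k = 3.
print(log_growth_rates([0.10, 0.11, 0.12, 0.55, 0.58]))
```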
4. Extreme Value Theory Foundation
CoDeGraph’s methodology is grounded in an extreme value theory (EVT) framework. Assuming the cross-image patch feature distance distribution possesses a heavy tail (as is typically seen with deep feature extractors), the theoretical foundation is that for normal patches, the log spacings between consecutive nearest neighbors behave as exponentially distributed variables:
$$g_k = \log d_{(k+1)} - \log d_{(k)} \sim \operatorname{Exp}(\alpha k),$$
with expected value $1/(\alpha k)$, $\alpha$ being the tail index of the distribution. As a result, the regular, slow growth for normal patches is interpretable as a high-probability EVT event, while the abrupt jump seen for consistent-anomaly patches is a statistical outlier under this model. This statistical analysis justifies the use of the endurance ratio for reliable outlier detection.
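The exponential-spacing claim can be checked numerically. In the toy model below, distances near zero follow $F(d) = d^\alpha$ (an illustrative stand-in for the lower tail of real feature distances, not the paper's setup); the empirical mean of $g_k$ then matches the predicted $1/(\alpha k)$:

```python
import numpy as np

rng = np.random.default_rng(0)
alpha, n, trials = 2.0, 500, 20_000

# Inverse-CDF sampling: F(d) = d**alpha on [0, 1]  =>  d = U**(1/alpha).
d = rng.uniform(size=(trials, n)) ** (1.0 / alpha)
d.sort(axis=1)                     # nearest-neighbor order statistics
g = np.diff(np.log(d), axis=1)     # log spacings g_k = log d_(k+1) - log d_(k)

for k in (1, 5, 20):
    print(f"k={k:2d}: mean g_k = {g[:, k - 1].mean():.4f}, "
          f"theory 1/(alpha*k) = {1 / (alpha * k):.4f}")
```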
5. Empirical Results and SOTA Comparison
CoDeGraph was evaluated on multiple benchmarks, notably MVTec AD, as well as specially curated “consistent anomaly” datasets (MVTec-CA, MVTec-SynCA, ConsistAD). Using ViT-L/14-336 (CLIP) and DINOv2 backbones, CoDeGraph achieved:
- Anomaly Classification: AUROC ≈ 98.3%
- Anomaly Segmentation: F1 = 66.8% (+4.2%), AP = 68.1% (+5.4%) over state-of-the-art zero-shot methods
- Further improvement with DINOv2: F1 = 69.1% (+6.5%), AP = 71.9% (+9.2%) vs. SOTA
On consistent anomaly benchmarks, segmentation F1 is improved by up to 14.9% and AP by 18.8% over the best prior zero-shot methods. The approach also demonstrates high performance with full-shot supervision, indicating that the graph-based filtering generalizes even as image and feature distributions change.
6. Robustness with Respect to Architecture
Analysis across architectures (including ViT-L/14-336 and DINOv2) reveals that CoDeGraph’s filtering procedure is robust: it consistently provides significant performance gains regardless of the patch feature backbone. Finer patch granularity (smaller patch sizes) and higher input resolution further increase the effectiveness of the approach. The graph-based filtering is thus stable across diverse model architectures, a property critical for heterogeneous industrial inspection environments.
7. Applications and Practical Implications
CoDeGraph is designed for—and excels in—zero-shot industrial quality control settings, where large-scale, unlabeled test sets are encountered and defects are often systematic rather than stochastic. By constructing a graph structure over the test set and employing community detection to isolate consistent anomaly clusters, CoDeGraph provides a generalizable upgrade to existing zero-shot algorithms, directly addressing the previously unsolved problem of repeated anomaly patterns. Its robust empirical performance across models and quality benchmarks validates its deployment suitability in practical inspection pipelines, enabling both highly sensitive anomaly detection and accurate segmentation in the presence of recurring defects. The theoretical analysis and principled metric design further make CoDeGraph applicable as a reusable module in other zero-shot or few-shot graph-based anomaly detection tasks.