
Target before Shooting: Accurate Anomaly Detection and Localization under One Millisecond via Cascade Patch Retrieval (2308.06748v1)

Published 13 Aug 2023 in cs.CV

Abstract: In this work, by re-examining the "matching" nature of Anomaly Detection (AD), we propose a new AD framework that simultaneously enjoys new records of AD accuracy and dramatically high running speed. In this framework, the anomaly detection problem is solved via a cascade patch retrieval procedure that retrieves the nearest neighbors for each test image patch in a coarse-to-fine fashion. Given a test sample, the top-K most similar training images are first selected based on a robust histogram matching process. Secondly, the nearest neighbor of each test patch is retrieved over the similar geometrical locations on those "global nearest neighbors", by using a carefully trained local metric. Finally, the anomaly score of each test image patch is calculated based on the distance to its "local nearest neighbor" and the "non-background" probability. The proposed method is termed "Cascade Patch Retrieval" (CPR) in this work. Different from the conventional patch-matching-based AD algorithms, CPR selects proper "targets" (reference images and locations) before "shooting" (patch-matching). On the well-acknowledged MVTec AD, BTAD and MVTec-3D AD datasets, the proposed algorithm consistently outperforms all the comparing SOTA methods by remarkable margins, measured by various AD metrics. Furthermore, CPR is extremely efficient. It runs at the speed of 113 FPS with the standard setting while its simplified version only requires less than 1 ms to process an image at the cost of a trivial accuracy drop. The code of CPR is available at https://github.com/flyinghu123/CPR.

Authors (6)
  1. Hanxi Li (15 papers)
  2. Jianfei Hu (1 paper)
  3. Bo Li (1108 papers)
  4. Hao Chen (1007 papers)
  5. Yongbin Zheng (3 papers)
  6. Chunhua Shen (404 papers)
Citations (9)

Summary

  • The paper proposes a novel cascade patch retrieval framework that enhances anomaly detection accuracy and speed by integrating global and local retrieval stages.
  • It employs a coarse-to-fine strategy with contrastive loss-based metric learning to refine patch matching and reduce false positives in cluttered scenes.
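  • The method runs at 113 FPS in its standard setting, and a simplified version processes an image in under 1 ms (over 1000 FPS) at the cost of a small accuracy drop. It outperforms the compared state-of-the-art methods on MVTec AD, BTAD and MVTec-3D AD, which makes it a candidate for rapid quality inspection.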

Analyzing "Target before Shooting: Accurate Anomaly Detection and Localization under One Millisecond via Cascade Patch Retrieval"

The paper presents a framework for anomaly detection (AD) termed "Cascade Patch Retrieval" (CPR), which improves both the accuracy and the efficiency of AD. CPR retrieves the nearest neighbor of each test image patch in a coarse-to-fine cascade: it first selects reference images, then matches patches within them. With the standard setting the method runs at 113 frames per second (FPS); a simplified version processes an image in less than 1 ms, i.e. more than 1000 FPS, with only a small drop in accuracy.

Methodological Contributions

The CPR algorithm is built upon several key components:

  1. Global and Local Retrieval Stages: Following the coarse-to-fine approach, a global retrieval step selects the top-K training images most similar to the test image, using a robust histogram matching process. The subsequent local retrieval then searches only the patches at similar geometric locations on these "global nearest neighbors", which reduces both the search cost and the chance of matching an unrelated patch.
  2. Cascade Patch Retrieval Strategy: CPR fixes its "targets" (reference images and locations) before "shooting" (patch matching), so each test patch is compared against a small, relevant candidate set. The anomaly score of a patch combines the distance to its local nearest neighbor with an estimated "non-background" probability, which suppresses the false positives typical of cluttered backgrounds.
  3. Metric Learning with Contrastive Loss: Local patch retrieval relies on a carefully trained local metric whose features are optimized with a contrastive loss. The learned features make the nearest-neighbor distance a more reliable signal for localizing anomalies on the relevant object parts.
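
The three components above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (global_retrieval, local_retrieval, anomaly_map), the L1 histogram distance, the square search window of size `radius`, and the product of distance and foreground probability are all hypothetical stand-ins for the paper's histogram matching, learned local metric and score formula.

```python
import numpy as np

def global_retrieval(test_hist, train_hists, k):
    """Indices of the k training images whose histograms are closest to the test one.

    Assumption: plain L1 distance stands in for the paper's robust histogram matching.
    """
    dists = np.abs(train_hists - test_hist).sum(axis=1)
    return np.argsort(dists)[:k]

def local_retrieval(test_feats, ref_feats, radius):
    """Distance from each test patch to its nearest reference patch nearby.

    test_feats: (H, W, C) patch descriptors of the test image.
    ref_feats:  (K, H, W, C) descriptors of the K retrieved reference images.
    Only reference patches within `radius` grid cells of the test patch's
    location are searched (a hypothetical notion of "similar locations").
    """
    H, W, _ = test_feats.shape
    nn_dist = np.empty((H, W))
    for i in range(H):
        for j in range(W):
            i0, i1 = max(0, i - radius), min(H, i + radius + 1)
            j0, j1 = max(0, j - radius), min(W, j + radius + 1)
            window = ref_feats[:, i0:i1, j0:j1, :]            # (K, h, w, C)
            d = np.linalg.norm(window - test_feats[i, j], axis=-1)
            nn_dist[i, j] = d.min()
    return nn_dist

def anomaly_map(nn_dist, fg_prob):
    """Combine NN distance with foreground probability (assumed: simple product)."""
    return nn_dist * fg_prob

# Toy usage: the test image is training image 2 with one corrupted patch.
rng = np.random.default_rng(0)
train_feats = rng.normal(size=(5, 6, 6, 4))
train_hists = rng.random(size=(5, 8))
test_feats = train_feats[2].copy()
test_feats[3, 4] += 10.0                                      # inject an "anomaly"

top_k = global_retrieval(train_hists[2], train_hists, k=2)
scores = anomaly_map(local_retrieval(test_feats, train_feats[top_k], radius=1),
                     fg_prob=np.ones((6, 6)))
print(np.unravel_index(scores.argmax(), scores.shape))
```

The explicit double loop keeps the windowed search easy to read; a fast implementation would batch it on the GPU, and the descriptors would come from the trained metric network rather than random numbers.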

Experimental Design and Findings

The paper rigorously benchmarks CPR against several state-of-the-art (SOTA) anomaly detection models across three widely recognized datasets: MVTec AD, MVTec-3D AD, and BTAD. Evaluation metrics include AP, PRO, and Pixel-AUC for localization, alongside Image-AUC for detection. CPR not only surpasses existing methods in accuracy but also establishes new performance records on these datasets.
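
For readers unfamiliar with the AUC metrics named above, the following is a minimal sketch of how Image-AUC and Pixel-AUC can be computed from anomaly maps. It is an illustration, not the paper's evaluation code: the helper names are hypothetical, and taking the maximum of an anomaly map as the image-level score is a common convention assumed here rather than something stated in the source.

```python
import numpy as np

def auroc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a random positive (anomalous) sample scores
    higher than a random negative (normal) one, with ties counted as half.
    Memory is O(n_pos * n_neg), so this is only suitable for small inputs.
    """
    scores = np.asarray(scores, dtype=float).ravel()
    labels = np.asarray(labels).ravel().astype(bool)
    pos, neg = scores[labels], scores[~labels]
    if pos.size == 0 or neg.size == 0:
        raise ValueError("need at least one positive and one negative sample")
    diff = pos[:, None] - neg[None, :]
    return ((diff > 0).sum() + 0.5 * (diff == 0).sum()) / (pos.size * neg.size)

def image_auc(anomaly_maps, image_labels):
    """Image-level AUC; each image is scored by its maximum pixel score (assumed)."""
    maps = np.asarray(anomaly_maps, dtype=float)
    image_scores = maps.reshape(len(maps), -1).max(axis=1)
    return auroc(image_scores, image_labels)

def pixel_auc(anomaly_maps, gt_masks):
    """Pixel-level AUC over all pixels of all images pooled together."""
    return auroc(anomaly_maps, gt_masks)
```

AP and PRO follow the same pattern of comparing pixel scores against ground-truth masks, but weight the pixels differently; they are omitted here for brevity.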

CPR is also efficient. It runs at 113 FPS with the standard setting, and its simplified version needs less than 1 ms per image, which corresponds to more than 1000 FPS, at the cost of a small accuracy drop. This top speed is emphasized for settings where the algorithm is streamlined for efficiency using TensorRT optimizations.
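
The two speed figures quoted in the abstract use different units (frames per second versus milliseconds per image), so a short conversion helps compare them. The helper names below are hypothetical, and the conversion assumes images are processed one at a time, so that throughput is simply the reciprocal of per-image latency.

```python
def fps_to_latency_ms(fps):
    """Per-image latency in milliseconds for a given throughput in FPS."""
    return 1000.0 / fps

def latency_ms_to_fps(latency_ms):
    """Throughput in FPS for a given per-image latency in milliseconds."""
    return 1000.0 / latency_ms

# Standard setting from the abstract: 113 FPS is roughly 8.85 ms per image.
print(round(fps_to_latency_ms(113), 2))
# Simplified version: anything under 1 ms per image is above 1000 FPS.
print(latency_ms_to_fps(1.0))
```

Under this assumption, the simplified version is roughly nine times faster per image than the standard setting.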

Implications and Future Directions

From a practical perspective, CPR offers industries reliant on real-time quality inspection a robust tool capable of accommodating the challenging requirements of fast-paced environments without compromising accuracy. The two-phase retrieval strategy, integrating both global image alignment and local patch discrimination, sets a precedent for future AD models aiming at real-time applications. The proposed model can be expanded into domains requiring swift decision-making processes, such as autonomous vehicles and surveillance systems.

Theoretically, the proposal to integrate a learning-based foreground segmentation network opens pathways to incorporate semantic understanding within a rapid AD paradigm. Future research may explore integrating more sophisticated foreground-background differentiation methods or leveraging synthetic data augmentations to further strengthen the robustness and generalization capabilities of anomaly detection frameworks.

With CPR, the paper advances anomaly detection by balancing methodological simplicity, accuracy and speed, and it sets a new reference point for both academic and industrial applications.