Multi-Criteria Defect Detection
- Multi-Criteria Defect Detection is a method that integrates structural, logical, and appearance cues to identify and localize diverse defects in engineered systems.
- It employs integrated pipelines combining RGB-D sensing, semantic scene decomposition, and multimodal fusion to achieve robust defect detection, including zero-shot scenarios.
- Benchmarking shows improved metrics such as IoU and F1 scores, with multimodal fusion reducing false positives by up to 30% compared to single-criterion approaches.
Multi-criteria defect detection refers to the principled identification, localization, and quantification of diverse defect types within manufactured products or engineered systems, exploiting multiple, orthogonal criteria (structural, logical, appearance-based, and more) in unified frameworks. These approaches systematically extend beyond uni-criterion detection (e.g., solely geometric or texture-based) by integrating heterogeneous cues, enabling more robust, generalizable, and interpretable defect inspection under both supervised and zero-shot paradigms across complex real-world environments.
1. Formal Taxonomies and Annotation Strategies
Central to multi-criteria defect detection is the rigorous formalization of defect taxonomies with hierarchical annotation standards. A representative system organizes defects as follows (Araya-Martinez et al., 28 Nov 2025):
- Pose annotation: 6D pose (translation t ∈ ℝ³, rotation R ∈ SO(3)) per the BOP standard.
- Structural defects: pixel-wise masks (e.g., cracks, deformations, warping, dents).
- Logical defects: polygons annotated for existence (missing part), position (misalignment), type (incorrect subtype), surface properties (color/material mismatches).
The top-level taxonomy dichotomizes defects into:
| Super-category | Sub-categories |
|---|---|
| Structural | Deformation, Cracks, Dents, Warping, Impact marks, ... |
| Logical | Existence, Position, Type (qty/size/match), Color/Material |
This extensible annotation scheme, compatible with COCO/BOP conventions, allows unified benchmarking and composite evaluation across heterogeneous defect morphologies and causal origins.
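For concreteness, a COCO-style record for a logical "existence" defect under this taxonomy might look like the sketch below. The `defect` sub-dictionary and its field names are hypothetical extensions of the standard COCO keys, not a schema defined by the cited work; the pose fields follow the BOP naming convention.

```python
# Hypothetical COCO-style annotation for a "Logical / Existence" defect.
# Standard COCO keys (image_id, category_id, bbox, segmentation) are kept;
# the "defect" sub-dictionary is an illustrative extension, not a fixed schema.
annotation = {
    "image_id": 17,
    "category_id": 3,                      # e.g. index of "Logical/Existence"
    "bbox": [104.0, 58.0, 42.0, 42.0],     # [x, y, width, height]
    "segmentation": [[104, 58, 146, 58, 146, 100, 104, 100]],  # polygon
    "defect": {
        "super_category": "Logical",
        "sub_category": "Existence",       # missing part
        "pose": {                          # 6D pose, BOP-style fields
            "cam_R_m2c": [1, 0, 0, 0, 1, 0, 0, 0, 1],  # row-major 3x3 rotation
            "cam_t_m2c": [12.5, -3.1, 540.0],          # translation (mm)
        },
    },
}
print(annotation["defect"]["sub_category"])  # Existence
```

Because only the `defect` block is non-standard, such records remain loadable by ordinary COCO tooling, which is what makes the scheme compatible with unified benchmarking.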
2. Integrated Methodological Pipelines
Multi-criteria detection systems synthesize vision, geometry, and logic domains, employing the following core components (Araya-Martinez et al., 28 Nov 2025, Dey et al., 2022, Rachuri et al., 23 Dec 2024):
- Sensing and Preprocessing: Inputs may include RGB-D images, depth maps, time-series NDE signals (IE, USW), or multi-modal sensor arrays.
- Semantic Scene Decomposition: Object detection and CAD-based pose estimation (e.g., YOLOv8 + FoundationPose → ICP), enabling semantic digital twinning for reference generation (Araya-Martinez et al., 28 Nov 2025).
- Defect Criteria Extraction:
  - Depth/geometric deviations: per-pixel ΔD(u, v) = |D_obs(u, v) − D_ref(u, v)| between the observed depth map and the digital-twin reference.
  - Color/appearance deviations: in CIELAB space, ΔE(u, v) = ‖Lab_obs(u, v) − Lab_ref(u, v)‖₂ (Euclidean color distance).
- Structural anomalies: Detected via mask-prediction or morphological post-processing.
- Logical anomalies: Detected by topological mismatch, existence/absence, or misassembly.
- Scoring and Thresholding: Formation of defect masks by thresholding the depth (ΔD) and color (ΔE) deviation maps, and computation of standard metrics (intersection-over-union, mean IoU).
- Multimodal Fusion: In complex SHM domains, alpha-shape geospatial fusion of multivariate anomaly point clouds integrates NDE modalities with contour-aligned image features (Rachuri et al., 23 Dec 2024).
This unified logic-structure-appearance pipeline enables the detection of both known and previously unseen defect types, including highly variable logical and geometric faults.
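The depth and color criteria above reduce to simple per-pixel comparisons against a reference rendering. A minimal NumPy sketch with illustrative thresholds and a toy 2×2 input (a real pipeline would operate on registered RGB-D frames rendered from the digital twin):

```python
import numpy as np

def deviation_masks(depth_obs, depth_ref, lab_obs, lab_ref,
                    tau_d=2.0, tau_e=5.0):
    """Per-pixel defect masks from depth and CIELAB deviations.

    depth_*: (H, W) depth maps; lab_*: (H, W, 3) CIELAB images.
    tau_d / tau_e are illustrative thresholds, not values from the cited work.
    """
    delta_d = np.abs(depth_obs - depth_ref)               # geometric deviation
    delta_e = np.linalg.norm(lab_obs - lab_ref, axis=-1)  # Euclidean ΔE
    return delta_d > tau_d, delta_e > tau_e

def iou(pred, gt):
    """Intersection-over-union of two binary masks."""
    inter = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()
    return inter / union if union else 0.0

# Toy 2x2 scene: one dented pixel (depth) and one discolored pixel (color).
d_ref = np.zeros((2, 2)); d_obs = d_ref.copy(); d_obs[0, 0] = 5.0
lab_ref = np.zeros((2, 2, 3)); lab_obs = lab_ref.copy(); lab_obs[1, 1] = [0, 20, 0]
m_d, m_e = deviation_masks(d_obs, d_ref, lab_obs, lab_ref)
defect_mask = np.logical_or(m_d, m_e)   # fuse the two criteria
gt = np.array([[True, False], [False, True]])
print(iou(defect_mask, gt))  # 1.0
```

Fusing the masks with a logical OR reflects the multi-criteria principle: a pixel is flagged if any criterion deviates, which is why orthogonal cues catch defect types that a single criterion misses.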
3. Zero-Shot and Low-Data Generalization
A major advance in multi-criteria defect detection is robust zero-shot generalization (Araya-Martinez et al., 28 Nov 2025, Sadikaj et al., 9 Apr 2025). These frameworks:
- Require no defect-specific training; only object detection and (optionally) pose modules are learned.
- Construct on-the-fly “zero-defect” references using scene graph + digital twins.
- Detect arbitrary defect modalities as scene- or object-level deviations from idealized CAD-based expectations, including new cracks, surface anomalies, or logical inconsistencies (e.g., missing inserts), minimizing retraining costs.
- Empirically achieve up to 63.3% IoU against ground-truth masks under semi-controlled industrial conditions using simple per-pixel distance metrics (Araya-Martinez et al., 28 Nov 2025).
Zero-shot prompt-based architectures (e.g., MultiADS (Sadikaj et al., 9 Apr 2025)) use cross-modal alignment of rich defect-centric text prompts to CLIP-based patch features, further extending multi-type anomaly segmentation and multi-label detection without explicit training on defect exemplars.
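The patch-to-prompt scoring behind such architectures can be sketched with mock embeddings. In a real system, `patch_feats` would come from a CLIP image encoder and `prompt_feats` from its text encoder (one embedding per defect-type prompt); the vectors below are stand-ins, and the 0.5 threshold is illustrative.

```python
import numpy as np

def patch_prompt_scores(patch_feats, prompt_feats):
    """Cosine similarity between L2-normalized patch and prompt embeddings.

    patch_feats: (N_patches, D); prompt_feats: (N_prompts, D), e.g. one
    text embedding per defect type ("crack", "scratch", "missing part", ...).
    Returns an (N_patches, N_prompts) similarity matrix.
    """
    p = patch_feats / np.linalg.norm(patch_feats, axis=-1, keepdims=True)
    t = prompt_feats / np.linalg.norm(prompt_feats, axis=-1, keepdims=True)
    return p @ t.T

# Mock: 3 image patches and 2 defect-type prompts in a 4-D embedding space.
patches = np.array([[1.0, 0.0, 0.0, 0.0],
                    [0.0, 1.0, 0.0, 0.0],
                    [0.7, 0.7, 0.0, 0.0]])   # resembles both defect types
prompts = np.array([[1.0, 0.0, 0.0, 0.0],    # e.g. "a photo of a crack"
                    [0.0, 1.0, 0.0, 0.0]])   # e.g. "a photo of a missing part"
sims = patch_prompt_scores(patches, prompts)
per_type_mask = sims > 0.5   # per-type anomaly assignment, no defect training
print(per_type_mask.astype(int))
```

Thresholding each prompt's similarity column independently yields one mask per defect type, which is what enables multi-type segmentation and multi-label detection without defect exemplars.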
4. Benchmarking, Metrics, and Quantitative Results
Evaluation protocols for multi-criteria detection emphasize both instance-level and per-criterion breakouts:
- Structural and logical defect detection: Mean IoU up to 63.3% for existence anomalies and 62.9% for color anomalies in RGB-D digital twin comparisons under semi-controlled conditions (Araya-Martinez et al., 28 Nov 2025).
- Multimodal SHM: F1 rises from 0.71–0.75 (single-modality) to 0.83 with multimodal fusion and contour-based cross-verification, reducing false positives by 30% (Rachuri et al., 23 Dec 2024).
- Instance-based deep architectures: Mask R-CNN yields mAP ≈ 0.936 (mAP@0.5) across fine defect categories; YOLOv5-based weld inspection reaches mAP@0.5 = 98.7% across eight defect types (Dey et al., 2022, Yang et al., 2021).
- Cross-domain and zero-shot: MultiADS attains pixel-level AUROC ≥95% and competitive macro-F1 across five industrial datasets, outperforming previous zero-/few-shot baselines in multi-type segmentation (Sadikaj et al., 9 Apr 2025).
Evaluation protocols typically include IoU, precision/recall, mAP, pixel-level AUROC/AUPRO, and scene-wide anomaly F1, spanning all represented defect criteria.
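Several of these metrics are straightforward to compute directly. A minimal sketch of pixel-level AUROC (rank formulation, which assumes no tied scores) and F1 with NumPy, on a toy four-pixel example:

```python
import numpy as np

def pixel_auroc(scores, labels):
    """Pixel-level AUROC: probability that a random defect pixel scores
    higher than a random normal pixel (rank formulation, no tied scores)."""
    scores, labels = np.ravel(scores), np.ravel(labels).astype(bool)
    ranks = scores.argsort().argsort() + 1   # 1-based ranks of each score
    n_pos, n_neg = labels.sum(), (~labels).sum()
    return (ranks[labels].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

def f1(pred, gt):
    """F1 score of a binary prediction mask against ground truth."""
    pred, gt = np.ravel(pred).astype(bool), np.ravel(gt).astype(bool)
    tp = np.logical_and(pred, gt).sum()
    prec = tp / pred.sum() if pred.sum() else 0.0
    rec = tp / gt.sum() if gt.sum() else 0.0
    return 2 * prec * rec / (prec + rec) if (prec + rec) else 0.0

scores = np.array([0.9, 0.8, 0.3, 0.1])  # higher = more anomalous
labels = np.array([1, 1, 0, 0])          # ground-truth defect pixels
print(pixel_auroc(scores, labels))  # 1.0 (perfect ranking)
print(f1(scores > 0.5, labels))     # 1.0
```

AUROC evaluates the raw anomaly scores independently of any threshold, while IoU and F1 evaluate a specific thresholded mask; multi-criteria benchmarks report both for that reason.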
5. Representative Architectures and Methodological Innovations
Recent frameworks integrate hierarchical annotation and multi-criterion detection with:
- Differentiable, per-defect mask and feature alignment (Mask R-CNN, SCM-MRCNN with channel/spatial attention (Yu et al., 6 Feb 2024)).
- Digital twin-driven scene simulation: On-the-fly CAD rendering for reference generation against which real-world deviations are scored (Araya-Martinez et al., 28 Nov 2025).
- Multi-modal fusion and cross-verification: Alpha-shape geospatial fusion and contour-based validation (NDE + vision) to reduce ambiguity (Rachuri et al., 23 Dec 2024).
- Zero-shot, multi-type segmentation: Patch-to-prompt cosine similarity over CLIP embeddings, with per-type mask extraction and prompt-based extensibility (Sadikaj et al., 9 Apr 2025).
- Hybrid statistical-ML fusions: Exploiting Fisher-separation and statistical feature selection atop deep or classical detectors for noise-robustness (Menéndez, 11 Dec 2024).
- RL-based multi-criteria exploration: Tunable multi-objective reinforcement learning reward design for Trojan and rare fault discovery in complex circuits (Sarihi et al., 2023).
These diverse architectures address both factory-side high-throughput inspection and field-side asset health monitoring/maintenance, efficiently bridging varied modalities and defect taxonomies.
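The Fisher-separation criterion used in the hybrid statistical-ML fusions above scores each candidate feature by its between-class mean separation relative to within-class variance. A toy sketch with two hypothetical features (the data and threshold-free selection rule are illustrative, not from the cited work):

```python
import numpy as np

def fisher_score(feature, labels):
    """Fisher separation of one feature between normal (0) and defect (1):
    squared mean difference divided by the sum of class variances."""
    x0, x1 = feature[labels == 0], feature[labels == 1]
    denom = x0.var() + x1.var()
    return (x0.mean() - x1.mean()) ** 2 / denom if denom else np.inf

# Two candidate features over 6 samples; labels mark defect samples.
labels = np.array([0, 0, 0, 1, 1, 1])
f_good = np.array([0.10, 0.20, 0.15, 0.90, 1.00, 0.95])  # well separated
f_noisy = np.array([0.40, 0.60, 0.50, 0.45, 0.55, 0.50]) # overlapping
scores = [fisher_score(f, labels) for f in (f_good, f_noisy)]
best = int(np.argmax(scores))  # keep the most discriminative feature
print(best)  # 0
```

Ranking features this way before feeding a downstream detector is what gives these hybrid pipelines their robustness to noisy or uninformative channels.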
6. Limitations, Open Challenges, and Future Directions
Despite substantial gains, current multi-criteria defect detection systems face several technical challenges (Araya-Martinez et al., 28 Nov 2025):
- Sensing limitations: Depth sensor noise, adverse lighting, non-ideal surfaces degrade geometric comparison fidelity.
- Alignment robustness: ICP-based pose refinement requires good initial estimates; clutter and occlusion compromise matching.
- Appearance metric limitations: Basic Euclidean color metrics in LAB may yield false positives; more sophisticated, perceptually calibrated metrics are needed.
- Annotation extensibility: Hierarchical taxonomies must expand to capture new logical/functional criteria; e.g., complex assembly constraints or dynamic connectivity.
- Dynamic and temporal domains: Moving/deformable parts require temporal fusion and real-time scene tracking.
- Active view planning: Integration with robot-guided sensor positioning can address occlusion-induced coverage gaps.
Future work is oriented toward deep learned comparison functions (e.g., semantic consistency), temporal and active inspection strategies, domain transfer by CAD-model loading, and expansion to further structured, logical, or physical defect classes across manufacturing and asset health applications (Araya-Martinez et al., 28 Nov 2025).
By formalizing and unifying criteria across geometric, logical, appearance-based, and semantic domains, and embedding these into scalable, high-throughput pipelines, multi-criteria defect detection approaches establish a common substrate for robust, interpretable, and extensible visual quality inspection in complex industrial and infrastructure settings (Araya-Martinez et al., 28 Nov 2025, Dey et al., 2022, Sadikaj et al., 9 Apr 2025, Rachuri et al., 23 Dec 2024).