
Real-IAD D3: A Real-World 2D/Pseudo-3D/3D Dataset for Industrial Anomaly Detection (2504.14221v1)

Published 19 Apr 2025 in cs.CV

Abstract: The increasing complexity of industrial anomaly detection (IAD) has positioned multimodal detection methods as a focal area of machine vision research. However, dedicated multimodal datasets specifically tailored for IAD remain limited. Pioneering datasets like MVTec 3D have laid essential groundwork in multimodal IAD by incorporating RGB+3D data, but still face challenges in bridging the gap with real industrial environments due to limitations in scale and resolution. To address these challenges, we introduce Real-IAD D3, a high-precision multimodal dataset that uniquely incorporates an additional pseudo-3D modality generated through photometric stereo, alongside high-resolution RGB images and micrometer-level 3D point clouds. Real-IAD D3 features finer defects, diverse anomalies, and greater scale across 20 categories, providing a challenging benchmark for multimodal IAD. Additionally, we introduce an effective approach that integrates RGB, point cloud, and pseudo-3D depth information to leverage the complementary strengths of each modality, enhancing detection performance. Our experiments highlight the importance of these modalities in boosting detection robustness and overall IAD performance. The dataset and code are publicly accessible for research purposes at https://realiad4ad.github.io/Real-IAD D3

Summary

Insights on the Real-IAD D3 Dataset for Industrial Anomaly Detection

The paper "Real-IAD D3: A Real-World 2D/Pseudo-3D/3D Dataset for Industrial Anomaly Detection" presents the creation and evaluation of a multimodal dataset aimed at addressing the limitations faced by existing industrial anomaly detection (IAD) datasets. With advancements in machine vision driving the need for more effective multimodal approaches, this paper introduces a high-precision dataset, Real-IAD D3, that incorporates RGB, micrometer-level 3D point clouds, and a novel pseudo-3D modality derived from photometric stereo. This contribution is poised to strengthen both methodological and practical aspects of IAD, particularly in real-world industrial settings.

Dataset Attributes and Contributions

Real-IAD D3 stands out for its broad coverage of industrial objects and defects. It spans 20 categories and 8,450 samples: 5,000 normal and 3,450 anomalous. Each sample is synchronized across the three modalities: RGB images, pseudo-3D depth maps, and detailed 3D point clouds. The point-cloud data is captured at a resolution as fine as 0.002 mm, which the paper contrasts with existing datasets such as MVTec 3D-AD and Real3D-AD that are more limited in resolution and diversity.

The paper also introduces a pseudo-3D modality, which conventional 2D and 3D datasets lack. Generated through photometric stereo, this modality captures subtle surface characteristics that are sensitive to material properties, which supports better pixel-level defect localization. It makes the dataset more useful for modeling complex industrial scenarios and provides a demanding benchmark for anomaly detection algorithms.
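As a rough illustration of how photometric stereo can turn several differently lit images into surface-orientation information, here is a minimal sketch of the classic Lambertian least-squares formulation. It is not the authors' pipeline: the summary does not describe their capture rig or processing, and the function name `photometric_stereo`, the calibrated light directions, and the synthetic flat surface are all assumptions made for this example.

```python
import numpy as np


def photometric_stereo(images, light_dirs):
    """Estimate per-pixel normals and albedo under a Lambertian model.

    images:     array of shape (K, H, W), one grayscale image per light
    light_dirs: array of shape (K, 3), unit light directions (assumed known)
    Shadows and specular highlights are ignored in this sketch.
    """
    K, H, W = images.shape
    intensities = images.reshape(K, -1)  # (K, H*W)
    # Lambertian model: intensity = light_dir . (albedo * normal).
    # Solve light_dirs @ G = intensities in the least-squares sense.
    G, *_ = np.linalg.lstsq(light_dirs, intensities, rcond=None)  # (3, H*W)
    albedo = np.linalg.norm(G, axis=0)
    normals = G / np.maximum(albedo, 1e-12)  # normalise to unit length
    return normals.T.reshape(H, W, 3), albedo.reshape(H, W)


# Synthetic check: a flat 2x2 surface facing the camera, albedo 0.8.
lights = np.array([
    [0.0, 0.0, 1.0],
    [0.6, 0.0, 0.8],
    [0.0, 0.6, 0.8],
    [-0.6, 0.0, 0.8],
])
true_normal = np.array([0.0, 0.0, 1.0])
imgs = 0.8 * (lights @ true_normal)[:, None, None] * np.ones((4, 2, 2))
normals, albedo = photometric_stereo(imgs, lights)
```

The recovered normal map is what makes such a modality "pseudo-3D": it encodes local surface orientation per pixel without measuring absolute depth, which is why fine scratches or dents can show up even when they are faint in a single RGB image.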

Experimental Evaluation and Benchmarking

The authors propose a benchmark model for anomaly detection on the Real-IAD D3 dataset, termed the "D3M" framework. This model draws on the complementary features of the RGB, pseudo-3D, and 3D modalities, using DINO to extract visual features and Point-MAE to extract geometric features. The framework fuses these features with Channel-Spatial Swapping (CSS) and contrastive learning, with the aim of improving detection precision and reliability through multimodal fusion.
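The summary does not spell out how CSS or the contrastive objective is defined, so the following is only one plausible reading, sketched under stated assumptions: features from two modalities are arrays of shape (N, C) with N spatial tokens and C channels, "swapping" exchanges a leading fraction of channels or tokens, and the contrastive term is a standard InfoNCE loss that treats same-position tokens as positive pairs. The names `swap_leading` and `info_nce` are hypothetical and do not come from the paper.

```python
import numpy as np


def swap_leading(feat_a, feat_b, ratio=0.5, axis=1):
    """Exchange the leading `ratio` fraction of entries along `axis`.

    axis=1 swaps channels; axis=0 swaps spatial tokens.
    Inputs are left unmodified; swapped copies are returned.
    """
    k = int(feat_a.shape[axis] * ratio)
    idx = [slice(None)] * feat_a.ndim
    idx[axis] = slice(0, k)
    idx = tuple(idx)
    out_a, out_b = feat_a.copy(), feat_b.copy()
    out_a[idx] = feat_b[idx]
    out_b[idx] = feat_a[idx]
    return out_a, out_b


def info_nce(feat_a, feat_b, temperature=0.1):
    """InfoNCE loss: row i of feat_a should match row i of feat_b."""
    a = feat_a / np.linalg.norm(feat_a, axis=1, keepdims=True)
    b = feat_b / np.linalg.norm(feat_b, axis=1, keepdims=True)
    logits = a @ b.T / temperature  # (N, N) cosine similarities
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_prob)))


# Toy features: 4 tokens with 4 channels per modality.
rgb_feat = np.zeros((2, 4))
pc_feat = np.ones((2, 4))
rgb_swapped, pc_swapped = swap_leading(rgb_feat, pc_feat, ratio=0.5, axis=1)

tokens = np.eye(4)
loss_aligned = info_nce(tokens, tokens)
loss_misaligned = info_nce(tokens, np.roll(tokens, 1, axis=0))
```

In this toy setup the loss is lower when tokens from the two modalities line up than when they are shifted against each other, which is the behaviour a contrastive fusion objective relies on.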

The experimental evaluations indicate that the additional modality improves anomaly detection performance. Specifically, methods incorporating pseudo-3D data show marked improvements over single-modality and dual-modality (2D + 3D) approaches. Visualizations and numerical results from the paper indicate better segmentation accuracy and robustness in detecting complex and subtle anomalies, emphasizing the importance of high-resolution and diverse datasets.
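This summary does not name the metrics behind those numerical results. As an assumption, the sketch below uses AUROC, a metric widely used in anomaly detection benchmarks, to show how anomaly scores can be compared against ground-truth labels at either the image or the pixel level. The function name `auroc` and the toy scores are illustrative only.

```python
def auroc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    scores: anomaly scores, higher means more anomalous
    labels: 1 for anomalous, 0 for normal
    Equals the probability that a random anomalous item scores higher
    than a random normal item, counting ties as one half.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one normal and one anomalous item")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example: two normal and two anomalous items.
result = auroc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1])  # 3 of 4 pairs ranked correctly
```

The pairwise form is quadratic in the number of items, so it suits small checks like this one; for full pixel-level evaluation over high-resolution images a sort-based implementation would be the practical choice.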

Implications and Future Directions

This work has significant implications for the development of industrial AI applications. The dataset and benchmark model serve as foundational tools that can accelerate research in multimodal anomaly detection and support better defect detection in industries where safety and quality are paramount. The dataset's structure and the accompanying method encourage future work on integrating additional modalities and refining anomaly detection algorithms.

For future developments in AI, the introduction of a pseudo-3D modality sets a precedent for exploring more sophisticated multimodal systems in industrial contexts. This could lead to systems that adapt to a broader range of materials and defect types, improving the generalizability of anomaly detection models. Moreover, building on the dataset’s multimodal features may open new pathways for AI-driven automation and inspection processes that demand high precision and reliability.

In conclusion, Real-IAD D3 effectively bridges the gap between academic research and practical industrial applications, offering an expansive and precisely curated resource that enables researchers to push the boundaries of industrial anomaly detection.