A Novel Approach to Industrial Defect Generation through Blended Latent Diffusion Model with Online Adaptation (2402.19330v2)

Published 29 Feb 2024 in cs.CV and cs.MM

Abstract: Effectively addressing the challenge of industrial Anomaly Detection (AD) necessitates an ample supply of defective samples, a constraint often hindered by their scarcity in industrial contexts. This paper introduces a novel algorithm designed to augment defective samples, thereby enhancing AD performance. The proposed method tailors the blended latent diffusion model for defect sample generation, employing a diffusion model to generate defective samples in the latent space. A feature editing process, controlled by a "trimap" mask and text prompts, refines the generated samples. The image generation inference process is structured into three stages: a free diffusion stage, an editing diffusion stage, and an online decoder adaptation stage. This sophisticated inference strategy yields high-quality synthetic defective samples with diverse pattern variations, leading to significantly improved AD accuracies based on the augmented training set. Specifically, on the widely recognized MVTec AD dataset, the proposed method elevates the state-of-the-art (SOTA) performance of AD with augmented data by 1.5%, 1.9%, and 3.1% for AD metrics AP, IAP, and IAP90, respectively. The implementation code of this work can be found at the GitHub repository https://github.com/GrandpaXun242/AdaBLDM.git
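The three-stage inference procedure named in the abstract can be illustrated with a minimal, self-contained sketch. The snippet below uses toy stand-ins (single convolutions for the denoiser and decoder, random latents, and a random mask); every name and hyperparameter is hypothetical, and the code only traces the control flow of the free-diffusion, editing-diffusion, and online-decoder-adaptation stages rather than the paper's actual implementation.

```python
# Hedged sketch of the three-stage inference described in the abstract.
# All components below are toy placeholders, not the paper's models.
import torch
import torch.nn as nn
import torch.nn.functional as F

latent_ch, H, W = 4, 32, 32
denoiser = nn.Conv2d(latent_ch, latent_ch, 3, padding=1)          # stand-in for the latent denoiser
decoder = nn.Sequential(nn.Upsample(scale_factor=4, mode="nearest"),
                        nn.Conv2d(latent_ch, 3, 3, padding=1))    # stand-in for the latent decoder

z = torch.randn(1, latent_ch, H, W)            # noisy latent to be denoised
z_normal = torch.randn(1, latent_ch, H, W)     # latent of a defect-free reference image (placeholder)
mask = (torch.rand(1, 1, H, W) > 0.8).float()  # trimap-like defect region (1 = defect)

with torch.no_grad():
    # Stage 1: free diffusion -- unconstrained denoising steps.
    for _ in range(20):
        z = z - 0.05 * denoiser(z)

    # Stage 2: editing diffusion -- after each step, blend the latent with the
    # defect-free latent everywhere outside the masked defect region.
    for _ in range(30):
        z = z - 0.05 * denoiser(z)
        z = mask * z + (1.0 - mask) * z_normal

# Stage 3: online decoder adaptation -- briefly fine-tune the decoder so that
# decoded pixels outside the defect region stay faithful to the normal image.
normal_image = torch.rand(1, 3, 4 * H, 4 * W)                     # defect-free image (placeholder)
pixel_mask = F.interpolate(mask, scale_factor=4.0, mode="nearest")
opt = torch.optim.Adam(decoder.parameters(), lr=1e-4)
for _ in range(10):
    recon = decoder(z)
    loss = ((1.0 - pixel_mask) * (recon - normal_image) ** 2).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()

synthetic_defect = decoder(z).detach()                            # final synthetic defective image
print(synthetic_defect.shape)                                      # torch.Size([1, 3, 128, 128])
```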

Authors (7)
  1. Hanxi Li (15 papers)
  2. Zhengxun Zhang (1 paper)
  3. Hao Chen (1006 papers)
  4. Lin Wu (78 papers)
  5. Bo Li (1107 papers)
  6. Deyin Liu (13 papers)
  7. Mingwen Wang (17 papers)

Summary

A New Frontier in Industrial Anomaly Detection: AdaBLDM's Novel Approach to Synthetic Defect Generation

Introduction

Industrial anomaly detection (AD) is fundamentally constrained by the scarcity of defective samples, which are essential for training models that reliably identify and localize manufacturing defects. To address this bottleneck, the paper introduces AdaBLDM, a customized variant of the Blended Latent Diffusion Model (BLDM) designed to generate realistic synthetic defective samples. By pairing the generative strength of diffusion models with components tailored specifically to industrial defect synthesis, the work delivers measurable gains in anomaly detection accuracy.

The Core of AdaBLDM

AdaBLDM extends the BLDM framework in several key ways. It controls defect generation with a "defect trimap" mechanism and text prompts, so that the synthesized defects are both high in quality and representative of a variety of defect patterns. Generation is further refined through a multi-stage editing and inference procedure that applies modifications in both latent and pixel space, followed by an online adaptation of the image decoder. Together, these refinements produce synthetic defective samples that are markedly more diverse and realistic than those generated by existing state-of-the-art methods. A sketch of one possible trimap construction follows below.
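To make the "defect trimap" idea concrete, the sketch below derives a three-level control map from a binary defect mask using simple morphology. The level encoding (background, boundary band, definite defect) and the erosion/dilation construction are illustrative assumptions and may differ from the construction actually used in the paper.

```python
# Hedged sketch: building a trimap-like control map from a binary defect mask.
# 0 = background, 1 = uncertain boundary band, 2 = definite defect interior.
import numpy as np
from scipy.ndimage import binary_dilation, binary_erosion

def make_trimap(defect_mask: np.ndarray, band: int = 3) -> np.ndarray:
    """defect_mask: HxW boolean array marking defect pixels."""
    core = binary_erosion(defect_mask, iterations=band)    # confident defect interior
    outer = binary_dilation(defect_mask, iterations=band)  # defect plus a surrounding band
    trimap = np.zeros(defect_mask.shape, dtype=np.uint8)
    trimap[outer] = 1                                       # boundary / transition band
    trimap[core] = 2                                        # definite defect region
    return trimap

mask = np.zeros((64, 64), dtype=bool)
mask[20:40, 25:45] = True                                   # toy rectangular defect
print(np.unique(make_trimap(mask), return_counts=True))
```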

Experimental Validation

AdaBLDM is validated on the MVTec AD, BTAD, and KSDD2 datasets. Augmenting the training sets of anomaly detection models with its synthetic defects consistently improves detection performance, surpassing both traditional augmentation approaches and other state-of-the-art generative methods; on MVTec AD, the augmented data raises AP, IAP, and IAP90 by 1.5%, 1.9%, and 3.1% over the previous best. These results support both the fidelity of the generated samples and their practical value in mitigating the chronic scarcity of defective samples in industrial AD.
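For reference, the sketch below shows how a pixel-level AP score (one of the metrics reported above) can be computed with scikit-learn; the ground-truth masks and anomaly heatmaps are random placeholders, and the paper's IAP and IAP90 variants are not reproduced here.

```python
# Hedged sketch of pixel-level AP evaluation on placeholder predictions.
import numpy as np
from sklearn.metrics import average_precision_score

rng = np.random.default_rng(0)
gt_masks = rng.random((10, 64, 64)) > 0.95   # ground-truth defect pixels (placeholder)
scores = rng.random((10, 64, 64))            # predicted anomaly heatmaps (placeholder)

ap = average_precision_score(gt_masks.reshape(-1), scores.reshape(-1))
print(f"pixel-level AP: {ap:.3f}")
```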

Implications and Future Directions

Beyond its immediate use for defect generation, AdaBLDM provides a framework for applying generative models within industrial quality control and related domains. Future work could incorporate more detailed defect descriptions in the text prompts to achieve finer-grained control over the synthesized defects, and could accelerate the diffusion model's inference so that larger volumes of synthetic samples can be generated within practical time budgets.

Conclusion

AdaBLDM demonstrates how generative models can be adapted to the specific demands of industrial anomaly detection. By producing realistic and diverse defective samples for AD training, it addresses a critical bottleneck in the field and provides a basis for further work on generative data augmentation in industrial quality control and other real-world applications.