A Novel Approach to Industrial Defect Generation through Blended Latent Diffusion Model with Online Adaptation (2402.19330v2)
Abstract: Effective industrial Anomaly Detection (AD) requires an ample supply of defective samples, yet such samples are scarce in industrial settings. This paper introduces a novel algorithm that augments defective samples to improve AD performance. The proposed method tailors the blended latent diffusion model to defect sample generation, using a diffusion model to generate defective samples in the latent space. A feature editing process, controlled by a “trimap” mask and text prompts, refines the generated samples. Image generation at inference is structured into three stages: a free diffusion stage, an editing diffusion stage, and an online decoder adaptation stage. This inference strategy yields high-quality synthetic defective samples with diverse pattern variations, which significantly improve AD accuracy when used to augment the training set. Specifically, on the widely used MVTec AD dataset, the proposed method raises the state-of-the-art (SOTA) performance of AD with augmented data by 1.5%, 1.9%, and 3.1% on the AD metrics AP, IAP, and IAP90, respectively. The implementation code of this work can be found at the GitHub repository https://github.com/GrandpaXun242/AdaBLDM.git
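The staged inference described in the abstract can be illustrated with a small toy sketch. It is not the authors' implementation: the function names (`blend`, `run_inference`), the flat-list latents, the step counts, and the placeholder denoiser are all assumptions made for illustration. It shows only the general idea of running unconstrained denoising first and then blending edited and source latents under a mask; a real blended latent diffusion pipeline would also noise the source latent to the current timestep, and the online decoder adaptation stage is not shown.

```python
from typing import Callable, List

Latent = List[float]


def blend(z_edit: Latent, z_source: Latent, mask: List[float]) -> Latent:
    """Mix two latents element-wise: mask=1 keeps the edited value,
    mask=0 keeps the source value, values in between interpolate."""
    return [m * e + (1.0 - m) * s for e, s, m in zip(z_edit, z_source, mask)]


def run_inference(
    z_init: Latent,
    z_source: Latent,
    mask: List[float],
    denoise_step: Callable[[Latent, int], Latent],
    free_steps: int,
    edit_steps: int,
) -> Latent:
    """Toy two-stage loop (hypothetical): free denoising, then masked editing."""
    z = list(z_init)
    # Stage 1 (free diffusion): denoise with no mask constraint.
    for t in range(free_steps):
        z = denoise_step(z, t)
    # Stage 2 (editing diffusion): denoise, then blend with the source latent
    # so that only the masked region is changed.
    for t in range(free_steps, free_steps + edit_steps):
        z = blend(denoise_step(z, t), z_source, mask)
    return z


def toy_denoiser(z: Latent, t: int) -> Latent:
    """Placeholder that halves every value; a real model would predict noise."""
    return [0.5 * v for v in z]


result = run_inference(
    z_init=[4.0, 4.0, 4.0],
    z_source=[10.0, 10.0, 10.0],
    mask=[1.0, 0.0, 0.5],  # defect region, untouched region, soft boundary
    denoise_step=toy_denoiser,
    free_steps=1,
    edit_steps=1,
)
print(result)  # [1.0, 10.0, 5.5]
```

In this toy run, the fully masked element follows the denoiser, the unmasked element is restored from the source latent, and the soft-boundary element is an even mix of the two.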
- Hanxi Li
- Zhengxun Zhang
- Hao Chen
- Lin Wu
- Bo Li
- Deyin Liu
- Mingwen Wang