
Defect Image Sample Generation With Diffusion Prior for Steel Surface Defect Recognition (2405.01872v1)

Published 3 May 2024 in cs.CV

Abstract: Steel surface defect recognition is an industrial problem of great practical value. Data insufficiency is the major challenge in training a robust defect recognition network. Existing methods have attempted to enlarge the dataset by generating samples with generative models, but their generation quality is still limited by the scarcity of defect image samples. To this end, we propose Stable Surface Defect Generation (StableSDG), which transfers the vast generation distribution embedded in the Stable Diffusion model to steel surface defect image generation. To tackle the distinctive distribution gap between steel surface images and the images generated by the diffusion model, we propose two processes. First, we align the distributions by adapting the parameters of the diffusion model, in both the token embedding space and the network parameter space. Second, in the generation process, we propose image-oriented generation rather than generation from pure Gaussian noise. Extensive experiments on a steel surface defect dataset demonstrate state-of-the-art performance in generating high-quality samples and in training recognition models, and both proposed processes are significant contributors to this performance.
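The "image-oriented generation" idea can be sketched as follows. This is a hedged illustration, not the paper's code: instead of denoising from pure Gaussian noise, img2img-style pipelines noise a real defect image partway through the DDPM forward process and denoise from there, which preserves coarse defect structure while letting the diffusion model repaint details. The function names, the linear beta schedule, and the chosen timestep below are illustrative assumptions.

```python
import numpy as np

def make_alpha_bar(T=1000, beta_start=1e-4, beta_end=0.02):
    """Cumulative product of (1 - beta_t) for a linear beta schedule."""
    betas = np.linspace(beta_start, beta_end, T)
    return np.cumprod(1.0 - betas)

def noise_to_timestep(x0, t, alpha_bar, rng=None):
    """DDPM forward process: x_t = sqrt(abar_t)*x0 + sqrt(1-abar_t)*eps."""
    rng = rng or np.random.default_rng(0)
    eps = rng.standard_normal(x0.shape)
    return np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * eps

alpha_bar = make_alpha_bar()
x0 = np.zeros((64, 64))  # stand-in for a normalized defect image
# A mid-range t keeps defect structure visible under the noise; denoising
# would then start from x_t instead of pure Gaussian noise.
x_t = noise_to_timestep(x0, t=600, alpha_bar=alpha_bar)
```

In a real pipeline the denoising steps from `t` down to 0 would be run by the fine-tuned Stable Diffusion model; the trade-off is that smaller starting timesteps stay closer to the source image, while larger ones give the model more freedom to synthesize new detail.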


