No-Service Rail Surface Defect Segmentation via Normalized Attention and Dual-scale Interaction (2306.15442v1)
Abstract: No-service rail surface defect (NRSD) segmentation is an essential way to perceive the quality of no-service rails. However, because of the complex and diverse outlines and low-contrast textures of no-service rails, existing natural-image segmentation methods cannot achieve promising performance on NRSD images, especially in some unique and challenging NRSD scenes. To this end, we propose a novel segmentation network for NRSDs based on Normalized Attention and Dual-scale Interaction, named NaDiNet. Specifically, NaDiNet follows the enhancement-interaction paradigm, with two key components: the Normalized Channel-wise Self-Attention Module (NAM) and the Dual-scale Interaction Block (DIB). NAM extends the channel-wise self-attention mechanism (CAM) to enhance features extracted from low-contrast NRSD images. The softmax layer in CAM produces very small correlation coefficients, which are not conducive to low-contrast feature enhancement; instead, NAM directly computes normalized correlation coefficients between channels to enlarge feature differentiation. DIB is designed for the interaction of the enhanced features. It has two interaction branches with dual scales, one for fine-grained clues and the other for coarse-grained clues; working together, the two branches let DIB perceive defect regions of different granularities. With these modules, NaDiNet generates accurate segmentation maps. Extensive experiments on the public NRSD-MN dataset, which contains man-made and natural NRSDs, demonstrate that NaDiNet with various backbones (i.e., VGG, ResNet, and DenseNet) consistently outperforms 10 state-of-the-art methods. The code and results of our method are available at https://github.com/monxxcn/NaDiNet.
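The contrast the abstract draws between CAM's softmax and NAM's direct normalization can be sketched in a few lines of NumPy. This is a minimal illustration, not the paper's implementation: the exact normalization scheme NAM uses is not specified in the abstract, so the min-max normalization below is an assumption chosen only to show how avoiding softmax preserves a larger spread between channel weights.

```python
import numpy as np

def channel_attention(x, use_softmax=True):
    """Channel-wise self-attention over a feature map x of shape (C, H, W).

    use_softmax=True mimics the standard CAM, where rows of the attention
    matrix sum to 1 and individual coefficients shrink toward 1/C.
    use_softmax=False sketches the idea behind NAM: normalize the raw
    channel correlations directly (min-max here, an assumption) so the
    coefficients keep a wider spread for low-contrast features.
    """
    C, H, W = x.shape
    f = x.reshape(C, H * W)                       # flatten spatial dims
    corr = f @ f.T                                # (C, C) channel correlations
    if use_softmax:
        e = np.exp(corr - corr.max(axis=1, keepdims=True))  # stable softmax
        attn = e / e.sum(axis=1, keepdims=True)   # each row sums to 1
    else:
        lo, hi = corr.min(), corr.max()
        attn = (corr - lo) / (hi - lo + 1e-8)     # direct [0, 1] normalization
    out = attn @ f                                # re-weight channel features
    return out.reshape(C, H, W) + x               # residual connection
```

For many channels, the softmax branch pushes every coefficient toward 1/C, flattening the attention map; the direct-normalization branch keeps the full [0, 1] range, which is the "enlarged feature differentiation" the abstract refers to.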