Fourier Boundary Features Network with Wider Catchers for Glass Segmentation (2405.09459v2)
Abstract: Glass blurs the boundary between the real world and its reflection. Its distinctive transmittance and reflectance confound semantic tasks in machine vision. Clarifying the boundary formed by glass, while avoiding the over-capture of features as false positives in deep structures, is therefore essential for constraining the segmentation of reflective surfaces and transparent glass. We propose the Fourier Boundary Features Network with Wider Catchers (FBWC), which may be the first attempt to use sufficiently wide, horizontal shallow branches, without vertical deepening, to guide fine-grained segmentation boundaries through primary glass semantic information. Specifically, we design Wider Coarse-Catchers (WCC) to anchor large-area segmentation and reduce excessive feature extraction from a structural perspective. We embed fine-grained features via Cross Transpose Attention (CTA), introduced to avoid incomplete regions within the boundary caused by reflection noise. To excavate glass features and balance context between high and low layers, we propose a learnable Fourier Convolution Controller (FCC) that robustly regulates information integration. The proposed method is validated on three public glass segmentation datasets. Experimental results show that it outperforms state-of-the-art (SOTA) methods in glass image segmentation.
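The abstract does not specify the internals of the Fourier Convolution Controller, but the general idea of Fourier-domain feature modulation can be sketched as follows: transform a feature map to the frequency domain, apply a learnable per-frequency gain, and transform back. The function name, shapes, and the identity-gain example below are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def fourier_mix(feat, spectral_weight):
    """Generic Fourier-domain feature modulation (illustrative sketch,
    not the paper's FCC): FFT -> learnable spectral gain -> inverse FFT."""
    spec = np.fft.rfft2(feat)                 # real 2-D FFT of the feature map
    spec = spec * spectral_weight             # per-frequency learnable gain
    return np.fft.irfft2(spec, s=feat.shape)  # back to the spatial domain

# An identity gain leaves the feature map unchanged (up to float error):
feat = np.random.rand(8, 8)
gain = np.ones((8, 5))   # rfft2 of an (8, 8) map has shape (8, 8 // 2 + 1)
out = fourier_mix(feat, gain)
```

In a trained network, `spectral_weight` would be a learned parameter, letting the module amplify or suppress individual frequency bands when fusing features across layers.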