Internal-External Boundary Attention Fusion for Glass Surface Segmentation (2307.00212v2)
Abstract: Glass surfaces of transparent objects and mirrors cannot be uniquely and explicitly characterized by their visual appearance, because they also carry the appearance of other reflected or transmitted surfaces. Detecting glass regions from a single color image is therefore a challenging task. Recent deep-learning approaches have focused on describing the glass surface boundary, where the transition in visual appearance between glass and non-glass surfaces is observed. In this work, we analytically investigate how the glass surface boundary helps to characterize glass objects. Inspired by prior semantic segmentation approaches for challenging image types such as X-ray or CT scans, we propose separate internal and external boundary attention modules that individually learn and selectively integrate the visual characteristics of the inside and outside regions of the glass surface from a single color image. The proposed method is evaluated on six public benchmarks against state-of-the-art methods and shows promising results.