A Deep Learning-based Global and Segmentation-based Semantic Feature Fusion Approach for Indoor Scene Classification (2302.06432v3)
Published 13 Feb 2023 in cs.CV
Abstract: This work proposes a novel approach that uses a semantic segmentation mask to obtain a 2D spatial layout of the segmentation categories across the scene, designated segmentation-based semantic features (SSFs). These features represent, per segmentation category, the pixel count as well as the 2D average position and the respective standard deviation values. Moreover, a two-branch network, GS2F2App, is proposed that fuses CNN-based global features extracted from RGB images with the segmentation-based features derived from the proposed SSFs. GS2F2App was evaluated on two indoor scene benchmark datasets, SUN RGB-D and NYU Depth V2, achieving state-of-the-art results on both.
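The SSF descriptor described above can be illustrated with a minimal sketch: for each segmentation category, compute the pixel count, the mean 2D position, and the per-axis standard deviation from a label mask. This is an assumed reading of the abstract, not the paper's exact implementation (the function name and use of raw pixel coordinates, rather than any normalization the authors may apply, are hypothetical):

```python
import numpy as np

def segmentation_semantic_features(mask: np.ndarray, num_categories: int) -> np.ndarray:
    """Per-category features from a 2D segmentation label mask:
    [pixel count, mean row, mean col, std row, std col].

    Hypothetical sketch of the SSFs described in the abstract; coordinates
    are raw pixel indices, with no normalization assumed.
    """
    h, w = mask.shape
    feats = np.zeros((num_categories, 5), dtype=np.float64)
    rows, cols = np.indices((h, w))  # per-pixel row/col coordinate grids
    for c in range(num_categories):
        sel = mask == c
        n = int(sel.sum())
        feats[c, 0] = n
        if n > 0:  # absent categories keep all-zero features
            feats[c, 1] = rows[sel].mean()
            feats[c, 2] = cols[sel].mean()
            feats[c, 3] = rows[sel].std()
            feats[c, 4] = cols[sel].std()
    return feats
```

For example, on a 2x2 mask `[[0, 0], [1, 1]]` with two categories, category 0 yields a count of 2, mean position (0, 0.5), and standard deviations (0, 0.5). The resulting `(num_categories, 5)` array could then be flattened and fed to the segmentation branch of a two-branch network.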
- Ricardo Pereira
- Tiago Barros
- Ana Lopes
- Urbano J. Nunes
- Luis Garrote