
A Deep Learning-based Global and Segmentation-based Semantic Feature Fusion Approach for Indoor Scene Classification (2302.06432v3)

Published 13 Feb 2023 in cs.CV

Abstract: This work proposes a novel approach that uses a semantic segmentation mask to obtain the 2D spatial layout of the segmentation categories across the scene, designated as segmentation-based semantic features (SSFs). These features represent, per segmentation category, the pixel count as well as the 2D average pixel position and the respective standard deviation values. Moreover, a two-branch network, GS2F2App, is proposed that fuses CNN-based global features extracted from RGB images with the segmentation-based features derived from the proposed SSFs. GS2F2App was evaluated on two indoor scene benchmark datasets, SUN RGB-D and NYU Depth V2, achieving state-of-the-art results on both.
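The abstract's description of the SSFs (per-category pixel count, 2D average position, and standard deviations) maps directly to a few array operations. Below is a minimal sketch of that computation, assuming a dense integer-valued segmentation mask; the function name, the all-zero row for absent categories, and the lack of normalization are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def segmentation_based_semantic_features(mask: np.ndarray, num_categories: int) -> np.ndarray:
    """Per-category SSFs from a semantic segmentation mask of shape (H, W).

    Each row holds [pixel count, mean x, mean y, std x, std y] for one
    segmentation category, following the description in the abstract.
    """
    h, w = mask.shape
    feats = np.zeros((num_categories, 5), dtype=np.float32)
    ys, xs = np.mgrid[0:h, 0:w]                # per-pixel coordinates
    for c in range(num_categories):
        sel = mask == c
        if not sel.any():
            continue                           # absent category: all-zero row (assumption)
        feats[c] = [sel.sum(),
                    xs[sel].mean(), ys[sel].mean(),
                    xs[sel].std(), ys[sel].std()]
    return feats
```

The two-branch design can likewise be sketched as a CNN backbone over the RGB image whose global features are concatenated with an MLP embedding of the flattened SSFs. The backbone choice (ResNet-18), the branch widths, and the concatenation-based fusion here are assumptions standing in for GS2F2App's actual architecture.

```python
import torch
import torch.nn as nn
import torchvision.models as models

class TwoBranchSceneClassifier(nn.Module):
    """Hypothetical two-branch classifier in the spirit of GS2F2App."""

    def __init__(self, num_categories: int, num_scenes: int):
        super().__init__()
        backbone = models.resnet18(weights=None)
        backbone.fc = nn.Identity()            # expose the 512-d global RGB feature
        self.rgb_branch = backbone
        self.ssf_branch = nn.Sequential(       # 5 SSF values per segmentation category
            nn.Linear(num_categories * 5, 256),
            nn.ReLU(),
        )
        self.classifier = nn.Linear(512 + 256, num_scenes)

    def forward(self, rgb: torch.Tensor, ssf: torch.Tensor) -> torch.Tensor:
        g = self.rgb_branch(rgb)               # (B, 512) global features
        s = self.ssf_branch(ssf.flatten(1))    # (B, 256) segmentation-based features
        return self.classifier(torch.cat([g, s], dim=1))
```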

Authors (5)
  1. Ricardo Pereira (4 papers)
  2. Tiago Barros (13 papers)
  3. Ana Lopes (3 papers)
  4. Urbano J. Nunes (8 papers)
  5. Luis Garrote (1 paper)
Citations (5)
