AEGIS-Net: Attention-guided Multi-Level Feature Aggregation for Indoor Place Recognition (2312.09538v1)

Published 15 Dec 2023 in cs.CV and cs.RO

Abstract: We present AEGIS-Net, a novel indoor place recognition model that takes RGB point clouds as input and generates global place descriptors by aggregating lower-level color and geometry features with higher-level implicit semantic features. Rather than simply concatenating these features, self-attention modules are employed to select the local features that best describe an indoor place. Our AEGIS-Net consists of a semantic encoder, a semantic decoder and an attention-guided feature embedding. The model is trained in a two-stage process, with the first stage focusing on an auxiliary semantic segmentation task and the second on the place recognition task. We evaluate AEGIS-Net on the ScanNetPR dataset and compare its performance with a pre-deep-learning feature-based method and five state-of-the-art deep-learning-based methods. Our AEGIS-Net outperforms all six methods.
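
To make the attention-guided aggregation concrete, the Python (PyTorch) sketch below illustrates the general idea only, not the authors' implementation: a learned scoring network weights per-point color/geometry features and implicit semantic features, the weighted features are pooled, and the concatenation is projected into a single L2-normalised place descriptor. All module names and dimensions are illustrative assumptions, and a simple attention-pooling module stands in for the paper's self-attention blocks.

# Minimal sketch (assumed names/dimensions), not the AEGIS-Net code.
import torch
import torch.nn as nn


class SelfAttentionPool(nn.Module):
    """Scores each local feature and pools a weighted global feature."""

    def __init__(self, feat_dim: int):
        super().__init__()
        self.score = nn.Sequential(
            nn.Linear(feat_dim, feat_dim // 2),
            nn.ReLU(inplace=True),
            nn.Linear(feat_dim // 2, 1),
        )

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        # feats: (N, D) local features for one point cloud
        weights = torch.softmax(self.score(feats), dim=0)  # (N, 1) attention weights
        return (weights * feats).sum(dim=0)                # (D,) pooled feature


class AttentionGuidedEmbedding(nn.Module):
    """Fuses color/geometry and implicit semantic features into one descriptor."""

    def __init__(self, low_dim: int = 64, sem_dim: int = 128, out_dim: int = 256):
        super().__init__()
        self.low_pool = SelfAttentionPool(low_dim)
        self.sem_pool = SelfAttentionPool(sem_dim)
        self.proj = nn.Linear(low_dim + sem_dim, out_dim)

    def forward(self, low_feats: torch.Tensor, sem_feats: torch.Tensor) -> torch.Tensor:
        fused = torch.cat([self.low_pool(low_feats), self.sem_pool(sem_feats)], dim=-1)
        return nn.functional.normalize(self.proj(fused), dim=-1)  # L2-normalised descriptor


# Example: 4096 points with 64-dim color/geometry and 128-dim semantic features.
embed = AttentionGuidedEmbedding()
descriptor = embed(torch.randn(4096, 64), torch.randn(4096, 128))
print(descriptor.shape)  # torch.Size([256])

In the full model, the two-stage training described in the abstract would first supervise the semantic encoder-decoder on segmentation and then train the place-recognition embedding (typically with a metric-learning objective); the sketch above omits both training stages.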
