
SG-PGM: Partial Graph Matching Network with Semantic Geometric Fusion for 3D Scene Graph Alignment and Its Downstream Tasks (2403.19474v1)

Published 28 Mar 2024 in cs.CV and cs.RO

Abstract: Scene graphs have recently been introduced into 3D spatial understanding as a comprehensive representation of the scene. Aligning 3D scene graphs is the first step of many downstream tasks, such as scene-graph-aided point cloud registration, mosaicking, overlap checking, and robot navigation. In this work, we treat 3D scene graph alignment as a partial graph-matching problem and propose to solve it with a graph neural network. We reuse the geometric features learned by a point cloud registration method and associate the clustered point-level geometric features with the node-level semantic features via our designed feature fusion module. Partial matching is enabled by a learnable method that selects the top-k similar node pairs. Downstream tasks such as point cloud registration are then performed by running a pre-trained registration network within the matched regions. We further propose a point-matching rescoring method that uses the node-wise alignment of the 3D scene graph to reweight the matching candidates from a pre-trained point cloud registration method. This reduces false point correspondences, especially in low-overlap cases. Experiments show that our method improves alignment accuracy by 10 to 20% in low-overlap and random-transformation scenarios and outperforms existing work in multiple downstream tasks.
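
The two steps the abstract names, selecting the top-k node pairs and rescoring point matches by node alignment, can be illustrated with a minimal sketch. This is not the paper's implementation: the paper learns the selection, whereas the code below swaps in a fixed-k greedy pick over a given similarity matrix, and the rescoring rule (multiplying by a `penalty` factor when the two points' nodes are not matched) is an assumption made for illustration. All function names, parameters, and the toy data are hypothetical.

```python
from typing import Dict, List, Tuple


def top_k_node_pairs(sim: List[List[float]], k: int) -> List[Tuple[int, int]]:
    """Greedily pick up to k one-to-one node pairs with the highest similarity.

    Stand-in for the learnable top-k selection described in the abstract.
    """
    ranked = sorted(
        ((sim[i][j], i, j) for i in range(len(sim)) for j in range(len(sim[i]))),
        key=lambda t: -t[0],
    )
    used_src, used_ref = set(), set()
    pairs: List[Tuple[int, int]] = []
    for _, i, j in ranked:
        if len(pairs) == k:
            break
        if i in used_src or j in used_ref:
            continue  # keep the matching one-to-one
        used_src.add(i)
        used_ref.add(j)
        pairs.append((i, j))
    return pairs


def rescore_point_matches(
    candidates: List[Tuple[int, int, float]],
    src_node_of: Dict[int, int],
    ref_node_of: Dict[int, int],
    node_pairs: List[Tuple[int, int]],
    penalty: float = 0.1,  # assumed down-weighting factor, not from the paper
) -> List[Tuple[int, int, float]]:
    """Down-weight point matches whose parent nodes are not aligned."""
    matched = set(node_pairs)
    rescored = []
    for p, q, score in candidates:
        agrees = (src_node_of[p], ref_node_of[q]) in matched
        rescored.append((p, q, score if agrees else score * penalty))
    return rescored


# Toy data: 3 source nodes vs. 3 reference nodes, only 2 of which overlap.
sim = [
    [0.9, 0.1, 0.2],
    [0.2, 0.8, 0.3],
    [0.1, 0.4, 0.3],
]
pairs = top_k_node_pairs(sim, k=2)

# Point-level candidates as (source point, reference point, score).
candidates = [(0, 0, 0.7), (1, 5, 0.9), (2, 3, 0.6)]
src_node_of = {0: 0, 1: 0, 2: 1}  # source point -> source node
ref_node_of = {0: 0, 5: 2, 3: 1}  # reference point -> reference node
rescored = rescore_point_matches(candidates, src_node_of, ref_node_of, pairs)
```

In this toy case the candidate with the highest raw score links points whose nodes are not aligned, so it is down-weighted while the two node-consistent candidates keep their scores. That is the kind of false correspondence the abstract says rescoring is meant to suppress.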

