Geometry-aware Feature Matching for Large-Scale Structure from Motion (2409.02310v3)
Abstract: Establishing consistent and dense correspondences across multiple images is crucial for Structure from Motion (SfM) systems. Significant view changes, such as air-to-ground imagery with very sparse view overlap, pose an even greater challenge to correspondence solvers. We present a novel optimization-based approach that significantly enhances existing feature matching methods by introducing geometry cues in addition to color cues. These cues help fill gaps in the correspondences when image overlap is limited in large-scale scenarios. Our method formulates geometric verification as an optimization problem, guiding feature matching within detector-free methods and using sparse correspondences from detector-based methods as anchor points. By enforcing geometric constraints via the Sampson Distance, our approach ensures that the denser correspondences from detector-free methods are geometrically consistent and more accurate. This hybrid strategy significantly improves correspondence density and accuracy, mitigates multi-view inconsistencies, and leads to notable improvements in camera pose accuracy and point cloud density. It outperforms state-of-the-art feature matching methods on benchmark datasets and enables feature matching in challenging, extremely large-scale settings.
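The abstract names the Sampson Distance as the geometric constraint. As a rough illustration only, the sketch below computes the textbook Sampson distance of a match under a fundamental matrix F (convention x2^T F x1 = 0) and uses it as a simple threshold filter. The function names, the threshold value, and the example matrix are assumptions made for this sketch; the paper's actual optimization, which uses detector-based matches as anchors, is not reproduced here.

```python
def sampson_distance(F, p1, p2):
    """Sampson distance (first-order approximation of the squared
    geometric error) of a match p1 <-> p2 under fundamental matrix F.

    F  : 3x3 nested list, convention x2^T F x1 = 0
    p1 : (u, v) pixel coordinates in image 1
    p2 : (u, v) pixel coordinates in image 2
    """
    x1 = (p1[0], p1[1], 1.0)  # homogeneous coordinates
    x2 = (p2[0], p2[1], 1.0)

    # F x1 is the epipolar line of p1 in image 2;
    # F^T x2 is the epipolar line of p2 in image 1.
    Fx1 = [sum(F[i][j] * x1[j] for j in range(3)) for i in range(3)]
    Ftx2 = [sum(F[i][j] * x2[i] for i in range(3)) for j in range(3)]

    # Algebraic epipolar residual x2^T F x1.
    residual = sum(x2[i] * Fx1[i] for i in range(3))

    denom = Fx1[0] ** 2 + Fx1[1] ** 2 + Ftx2[0] ** 2 + Ftx2[1] ** 2
    if denom == 0.0:
        return float("inf")  # degenerate configuration, treat as inconsistent
    return residual ** 2 / denom


def filter_matches(F, matches, threshold=1.0):
    """Keep matches whose Sampson distance is at most `threshold`.

    A hypothetical hard-threshold filter, used here only to show how the
    distance separates consistent from inconsistent matches.
    """
    return [(p1, p2) for p1, p2 in matches
            if sampson_distance(F, p1, p2) <= threshold]


# Illustrative F for a purely horizontal camera translation:
# epipolar lines are image rows, so consistent matches share the same v.
F_example = [[0.0, 0.0, 0.0],
             [0.0, 0.0, -1.0],
             [0.0, 1.0, 0.0]]

candidate_matches = [((10.0, 20.0), (15.0, 20.0)),   # same row: consistent
                     ((10.0, 20.0), (15.0, 24.0))]   # 4 px off the epipolar line

kept = filter_matches(F_example, candidate_matches, threshold=1.0)
```

In this toy setup the first match lies exactly on its epipolar line and is kept, while the second is rejected. A practical pipeline would estimate F from the sparse anchor matches rather than assume it.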
Authors:
- Gonglin Chen
- Jinsen Wu
- Haiwei Chen
- Wenbin Teng
- Zhiyuan Gao
- Andrew Feng
- Rongjun Qin
- Yajie Zhao