
Geometry-aware Feature Matching for Large-Scale Structure from Motion (2409.02310v3)

Published 3 Sep 2024 in cs.CV

Abstract: Establishing consistent and dense correspondences across multiple images is crucial for Structure from Motion (SfM) systems. Significant view changes, such as air-to-ground with very sparse view overlap, pose an even greater challenge to the correspondence solvers. We present a novel optimization-based approach that significantly enhances existing feature matching methods by introducing geometry cues in addition to color cues. This helps fill gaps when there is less overlap in large-scale scenarios. Our method formulates geometric verification as an optimization problem, guiding feature matching within detector-free methods and using sparse correspondences from detector-based methods as anchor points. By enforcing geometric constraints via the Sampson Distance, our approach ensures that the denser correspondences from detector-free methods are geometrically consistent and more accurate. This hybrid strategy significantly improves correspondence density and accuracy, mitigates multi-view inconsistencies, and leads to notable advancements in camera pose accuracy and point cloud density. It outperforms state-of-the-art feature matching methods on benchmark datasets and enables feature matching in challenging extreme large-scale settings.

References (49)
  1. Building Rome in a Day. In 2009 IEEE 12th International Conference on Computer Vision, pages 72–79, 2009.
  2. HPatches: A benchmark and evaluation of handcrafted and learned local descriptors. In CVPR, 2017.
  3. Speeded-Up Robust Features (SURF). CVIU, 110(3):346–359, 2008.
  4. 3D model acquisition from extended image sequences. In Computer Vision — ECCV ’96, pages 683–695, Berlin, Heidelberg, 1996. Springer Berlin Heidelberg.
  5. ASpanFormer: Detector-free image matching with adaptive span transformer. ECCV, 2022.
  6. Universal correspondence network. NeurIPS, 2016.
  7. ScanNet: Richly-annotated 3D reconstructions of indoor scenes, 2017.
  8. MonoSLAM: Real-time single camera SLAM. IEEE Transactions on Pattern Analysis and Machine Intelligence, 29(6):1052–1067, 2007.
  9. SuperPoint: Self-supervised interest point detection and description. CVPR Workshops, pages 224–236, 2018.
  10. DKM: Dense kernelized feature matching for geometry estimation. In IEEE Conference on Computer Vision and Pattern Recognition, 2023.
  11. RoMa: Robust dense feature matching. IEEE Conference on Computer Vision and Pattern Recognition, 2024.
  12. StereoScan: Dense 3D reconstruction in real-time. In 2011 IEEE Intelligent Vehicles Symposium (IV), pages 963–968, 2011.
  13. Multiple View Geometry in Computer Vision. Cambridge University Press, New York, NY, USA, 2nd edition, 2003.
  14. Adaptive assignment for geometry aware local feature matching. CVPR, 2023.
  15. Image matching across wide baselines: From paper to practice. International Journal of Computer Vision, 2020.
  16. 3D Gaussian splatting for real-time radiance field rendering. ACM Transactions on Graphics, 42(4), 2023a.
  17. 3D Gaussian splatting for real-time radiance field rendering, 2023b.
  18. DenseGAP: Graph-structured dense correspondence learning with anchor points. In 2022 26th International Conference on Pattern Recognition (ICPR), 2022.
  19. Dual-resolution correspondence networks. NeurIPS, 2020.
  20. MegaDepth: Learning single-view depth prediction from internet photos. In CVPR, 2018.
  21. Pixel-perfect Structure-from-Motion with featuremetric refinement. In ICCV, 2021.
  22. LightGlue: Local feature matching at light speed. In ICCV, 2023.
  23. SIFT Flow: Dense correspondence across scenes and its applications. T-PAMI, 2010.
  24. David G. Lowe. Distinctive image features from scale-invariant keypoints. Int. J. Comput. Vision, 60(2):91–110, 2004.
  25. NeRF: Representing scenes as neural radiance fields for view synthesis, 2020.
  26. Keypoint detection in RGBD images based on an anisotropic scale space. IEEE Transactions on Multimedia, 18(9):1762–1771, 2016.
  27. Relative 3D reconstruction using multiple uncalibrated images. The International Journal of Robotics Research, 14(6):619–632, 1995.
  28. Real-time localization and 3D reconstruction. In 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR’06), pages 363–370, 2006.
  29. DTAM: Dense tracking and mapping in real-time. In 2011 International Conference on Computer Vision, pages 2320–2327, 2011.
  30. DINOv2: Learning robust visual features without supervision, 2024.
  31. DiffPoseNet: Direct differentiable camera pose estimation, 2022.
  32. Fast and accurate camera covariance computation for large 3D reconstruction, 2018.
  33. Neighbourhood consensus networks. NeurIPS, 2018.
  34. Efficient neighbourhood consensus networks via submanifold sparse convolutions. ECCV, 2020.
  35. ORB: An efficient alternative to SIFT or SURF. ICCV, 2011.
  36. SLAM++: Simultaneous localisation and mapping at the level of objects. In 2013 IEEE Conference on Computer Vision and Pattern Recognition, pages 1352–1359, 2013.
  37. SuperGlue: Learning feature matching with graph neural networks. In CVPR, 2020.
  38. Self-supervised visual descriptor learning for dense correspondence. RA-L, 2016.
  39. Structure-from-Motion revisited. In Conference on Computer Vision and Pattern Recognition (CVPR), 2016.
  40. Pixelwise view selection for unstructured multi-view stereo. In European Conference on Computer Vision (ECCV), 2016.
  41. LoFTR: Detector-free local feature matching with transformers. CVPR, 2021.
  42. SfM-Net: Learning of structure and motion from video, 2017.
  43. MatchFormer: Interleaving attention in transformers for feature matching. In Asian Conference on Computer Vision, 2022.
  44. Efficient LoFTR: Semi-dense local feature matching with sparse-like speed. In CVPR, 2024.
  45. LIFT: Learned invariant feature transform. ECCV, 2016.
  46. DS-SLAM: A semantic visual SLAM towards dynamic environments. In 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pages 1168–1174, 2018.
  47. RelPose: Predicting probabilistic relative rotation for single objects in the wild, 2022.
  48. ALIKED: A lighter keypoint and descriptor extraction network via deformable transformation, 2023.
  49. Patch2Pix: Epipolar-guided pixel-level correspondences. In CVPR, 2021.
Authors (8)
  1. Gonglin Chen (3 papers)
  2. Jinsen Wu (2 papers)
  3. Haiwei Chen (8 papers)
  4. Wenbin Teng (5 papers)
  5. Zhiyuan Gao (5 papers)
  6. Andrew Feng (27 papers)
  7. Rongjun Qin (47 papers)
  8. Yajie Zhao (22 papers)

Summary

Geometry-aware Feature Matching for Large-Scale Structure from Motion

This paper, "Geometry-aware Feature Matching for Large-Scale Structure from Motion", addresses the challenge of establishing consistent and dense correspondences in Structure from Motion (SfM) systems, particularly under significant viewpoint changes such as air-to-ground imagery with sparse view overlap.

Key Contributions

The authors introduce a novel optimization-based method that significantly enhances the performance of existing feature matching techniques by incorporating geometric cues alongside traditional color cues. The core of the method lies in a geometry-aware optimization module that leverages the Sampson Distance to enforce geometric consistency, refining dense correspondences iteratively. The method uses sparse correspondences derived from detector-based methods as anchor points, guiding the matching process within detector-free frameworks.

Methodology

The proposed method integrates both detector-based and detector-free feature matching approaches:

  1. Detector-based Methods: These methods, exemplified by SuperPoint and SuperGlue, identify keypoints and descriptors independently before matching them. They excel in scenarios with richly textured images and small viewpoint changes but falter under extreme conditions.
  2. Detector-free Methods: Methods like LoFTR, ASpanFormer, and MatchFormer perform dense matching in a single step using end-to-end training with neural networks. While they provide denser correspondences, they lack control over the consistency of keypoints across multiple views.
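To make the anchor-point idea concrete, the sketch below estimates a fundamental matrix from sparse detector-based matches using the classic normalized eight-point algorithm, then scores candidate dense matches by their distance to the induced epipolar lines. This is a minimal NumPy illustration of the general principle, not the paper's actual optimization module; all function names and the toy scene are our own.

```python
import numpy as np

def normalize_points(pts):
    """Similarity transform bringing points to zero mean, average norm sqrt(2)."""
    mean = pts.mean(axis=0)
    scale = np.sqrt(2) / np.linalg.norm(pts - mean, axis=1).mean()
    T = np.array([[scale, 0.0, -scale * mean[0]],
                  [0.0, scale, -scale * mean[1]],
                  [0.0, 0.0, 1.0]])
    pts_h = np.column_stack([pts, np.ones(len(pts))]) @ T.T
    return pts_h, T

def fundamental_from_anchors(p1, p2):
    """Normalized eight-point estimate of F from sparse anchors (Nx2, N >= 8)."""
    x1, T1 = normalize_points(p1)
    x2, T2 = normalize_points(p2)
    # One row per anchor: the epipolar constraint x2^T F x1 = 0, linear in F.
    A = np.stack([np.kron(b, a) for a, b in zip(x1, x2)])
    _, _, Vt = np.linalg.svd(A)
    F = Vt[-1].reshape(3, 3)
    # Enforce the rank-2 constraint, then undo the normalization.
    U, S, Vt2 = np.linalg.svd(F)
    F = U @ np.diag([S[0], S[1], 0.0]) @ Vt2
    F = T2.T @ F @ T1
    return F / np.linalg.norm(F)

def epipolar_line_distance(F, p1, p2):
    """Pixel distance from each point in image 2 to its epipolar line F x1."""
    h1 = np.column_stack([p1, np.ones(len(p1))])
    h2 = np.column_stack([p2, np.ones(len(p2))])
    lines = h1 @ F.T                               # epipolar lines in image 2
    residual = np.abs(np.sum(h2 * lines, axis=1))  # |x2^T F x1|
    return residual / np.linalg.norm(lines[:, :2], axis=1)

# Toy two-view scene: project random 3D points with known intrinsics and motion.
rng = np.random.default_rng(0)
K = np.array([[500.0, 0.0, 320.0], [0.0, 500.0, 240.0], [0.0, 0.0, 1.0]])
theta = 0.1
R = np.array([[np.cos(theta), 0.0, np.sin(theta)],
              [0.0, 1.0, 0.0],
              [-np.sin(theta), 0.0, np.cos(theta)]])
t = np.array([1.0, 0.2, 0.1])
X = rng.uniform([-2.0, -2.0, 4.0], [2.0, 2.0, 10.0], size=(12, 3))

def project(K, Xc):
    p = Xc @ K.T
    return p[:, :2] / p[:, 2:]

anchors1, anchors2 = project(K, X), project(K, X @ R.T + t)
F = fundamental_from_anchors(anchors1, anchors2)
dists = epipolar_line_distance(F, anchors1, anchors2)  # near zero for inliers
```

In a real pipeline the anchors would come from a sparse matcher such as SuperPoint + SuperGlue (typically with RANSAC), and the distance check would be applied to the much denser detector-free matches.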

The method formulates geometric verification as an optimization problem. By integrating sparse correspondences as geometric priors and enforcing geometric constraints via the Sampson Distance, the method iteratively refines and reassigns correspondences to ensure geometric consistency. This hybrid strategy combines the strengths of both approaches, offering improved correspondence density and accuracy and mitigating multi-view inconsistencies.
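The Sampson distance used for this verification is the standard first-order approximation to the geometric reprojection error under a fundamental matrix F. A minimal NumPy sketch of how such a check can gate correspondences (the toy F and the threshold are illustrative assumptions, not values from the paper):

```python
import numpy as np

def sampson_distance(F, p1, p2):
    """First-order geometric error of matches (p1, p2) under fundamental matrix F."""
    h1 = np.column_stack([p1, np.ones(len(p1))])
    h2 = np.column_stack([p2, np.ones(len(p2))])
    Fx1 = h1 @ F.T   # epipolar lines in image 2
    Ftx2 = h2 @ F    # epipolar lines in image 1
    num = np.sum(h2 * Fx1, axis=1) ** 2  # (x2^T F x1)^2
    den = Fx1[:, 0]**2 + Fx1[:, 1]**2 + Ftx2[:, 0]**2 + Ftx2[:, 1]**2
    return num / den

# Toy geometry: pure sideways translation gives F = [e]_x with e = (1, 0, 0),
# so corresponding points must lie on the same image row.
F = np.array([[0.0, 0.0, 0.0],
              [0.0, 0.0, -1.0],
              [0.0, 1.0, 0.0]])
p1 = np.array([[100.0, 50.0], [200.0, 80.0]])
p2 = np.array([[120.0, 50.0], [210.0, 81.0]])  # second match drifts one row
d = sampson_distance(F, p1, p2)
keep = d < 0.25  # illustrative threshold; inconsistent matches are dropped
```

Minimizing this distance over candidate match positions, rather than merely thresholding it, is what turns the verification into the optimization the authors describe.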

Experimental Results

The method has been evaluated on publicly available datasets, including the Image Matching Competition Benchmark and MegaDepth, as well as two specially collected air-to-ground datasets. The results demonstrate that:

  • Pose Estimation: The method achieves superior accuracy in camera pose estimation compared to state-of-the-art methods. For instance, on the IMC Phototourism benchmark, the method achieves an AUC of 90.1 @ 10°, outperforming SuperPoint + SuperGlue and ALIKED + LightGlue.
  • Air-to-Ground Reconstruction: The method successfully registers all images and aligns UAV images with ground images, producing superior 3D models even in challenging large-scale scenarios.
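The AUC @ 10° figure is the standard pose-accuracy metric: the area under the recall-vs-error-threshold curve up to 10 degrees. A minimal sketch of how it is commonly computed (the step count and trapezoidal integration follow common practice, not necessarily this paper's exact evaluation code):

```python
import numpy as np

def pose_auc(errors_deg, max_threshold_deg=10.0, num_steps=1000):
    """Area under the recall-vs-threshold curve, normalized to [0, 1]."""
    errors = np.sort(np.asarray(errors_deg, dtype=float))
    ts = np.linspace(0.0, max_threshold_deg, num_steps + 1)
    # recall[i] = fraction of image pairs with pose error <= ts[i]
    recall = np.searchsorted(errors, ts, side="right") / len(errors)
    # Trapezoidal integration; zero error on every pair scores 1.0.
    area = np.sum((recall[:-1] + recall[1:]) / 2.0) * (ts[1] - ts[0])
    return area / max_threshold_deg

# Example: a 5-degree error on every pair gives an AUC@10 of roughly 0.5.
auc = pose_auc([5.0, 5.0, 5.0])
```

The per-pair error is usually the maximum of the angular rotation error and the angular translation error between estimated and ground-truth relative poses.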

Implications and Future Work

The implications of this research are significant for the field of computer vision, particularly in applications involving large-scale and challenging SfM scenarios. The method's ability to bridge detector-based and detector-free approaches could inspire further research and development in hybrid feature matching techniques.

Practically, the method could enhance various applications such as aerial mapping, autonomous navigation, and augmented reality by providing more accurate and robust 3D reconstructions from diverse and challenging datasets.

Future developments could focus on improving the efficiency of the algorithm, potentially through the integration of more computationally efficient backbone models or the application of multi-view refinement techniques. Additionally, the approach could be extended to other domains requiring robust feature matching under varying conditions.

Conclusion

The "Geometry-aware Feature Matching for Large-Scale Structure from Motion" paper presents a robust method for enhancing feature matching through geometric consistency, demonstrating significant improvements in SfM reconstruction accuracy. This method's hybrid approach effectively addresses the challenges posed by extreme viewpoint changes and sparse view overlap, marking a notable advancement in the field of computer vision.
