Principled aggregation of SPIDER’s warp and descriptor matches with incomparable confidences

Develop a principled method for aggregating the correspondences produced by SPIDER’s two heads (the dense warp head and the geometry-aware descriptor head), given that their confidence scores are defined under different objective functions, so that the combined match set is consistent and reliable in challenging scenarios.

Background

SPIDER combines two complementary correspondence heads: a warp head emphasizing pattern-driven 2D matching and a descriptor head emphasizing geometry-driven matching. The authors experimented with feature-level ensembling and guidance-based fusion but found that these strategies either interfere with each head’s objective or limit diversity.

Their current practice samples matches from both heads by confidence, yet the confidences are not directly comparable because each head’s confidence is defined with respect to a different objective function. The authors explicitly state that the best way to aggregate matches under this mismatch remains an open question.
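To make the mismatch concrete, the following minimal sketch pools toy confidences from two heads and keeps the top k matches, first on raw scores and then after a within-head rank normalization. It is not the authors' method and does not resolve the open question. The function names (`rank_normalize`, `pooled_top_k`), the confidence values, and the choice of rank normalization as a baseline are all assumptions made for illustration.

```python
import numpy as np

def rank_normalize(conf):
    """Map confidences to (0, 1] by within-head rank; the highest gets 1.0."""
    conf = np.asarray(conf, dtype=float)
    order = np.argsort(conf, kind="stable")  # indices in ascending confidence
    ranks = np.empty(len(conf), dtype=float)
    ranks[order] = np.arange(1, len(conf) + 1)
    return ranks / len(conf)

def pooled_top_k(conf_warp, conf_desc, k, normalize):
    """Pool both heads, keep the k best scores, and count matches per head."""
    a = rank_normalize(conf_warp) if normalize else np.asarray(conf_warp, dtype=float)
    b = rank_normalize(conf_desc) if normalize else np.asarray(conf_desc, dtype=float)
    scores = np.concatenate([a, b])
    from_warp = np.concatenate([np.ones(len(a), dtype=bool),
                                np.zeros(len(b), dtype=bool)])
    top = np.argsort(-scores, kind="stable")[:k]  # descending, ties by index
    n_warp = int(from_warp[top].sum())
    return n_warp, k - n_warp

# Hypothetical confidences: the two heads happen to live on different scales.
conf_warp = np.array([0.99, 0.97, 0.95, 0.90, 0.85, 0.80])
conf_desc = np.array([0.40, 0.35, 0.30, 0.20, 0.10, 0.05])

raw = pooled_top_k(conf_warp, conf_desc, k=4, normalize=False)
ranked = pooled_top_k(conf_warp, conf_desc, k=4, normalize=True)
print(raw)     # (4, 0): raw pooling keeps only warp matches
print(ranked)  # (2, 2): rank normalization balances the two heads
```

Rank normalization only equalizes the score distributions of the two heads. It says nothing about whether a given rank corresponds to the same match reliability in each head, which is the part of the problem that remains open.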

References

Since the confidence value for each head is with respect to different objective functions, the best way to aggregate matches remains an open question.

SPIDER: Spatial Image CorresponDence Estimator for Robust Calibration (2511.17750 - Shao et al., 21 Nov 2025) in Section 4 (Visualization and Discussion), Limitation