Handling Ambiguity in Cross-View, Cross-Modality Correspondence
Determine algorithmic strategies that robustly handle ambiguity when predicting pixel-level correspondences between ground-level photographs and floor plans in cross-view, cross-modality settings. The ambiguity is most acute when a photo provides minimal contextual cues about the overall scene, or when the scene layout is structurally symmetric, so that multiple alignments are equally plausible.
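To make the symmetry case concrete, the following is a minimal sketch and not the method of the cited paper. It assumes the floor plan is available as a small 2D occupancy grid (1 = wall, 0 = free space) and considers only the eight rigid symmetries of a grid. The names `TRANSFORMS`, `plan_symmetries`, and `equivalent_pixels` are hypothetical helpers introduced for illustration. For a given floor-plan pixel, the sketch lists every other pixel that the plan's own symmetries map it to, that is, the set of targets that geometry alone cannot tell apart.

```python
import numpy as np

# The eight symmetries of a rectangular grid (rotations and reflections).
# Illustrative assumption: real floor plans may have only approximate or
# partial symmetries, which this exact test would miss.
TRANSFORMS = {
    "identity": lambda a: a,
    "rot90": lambda a: np.rot90(a, 1),
    "rot180": lambda a: np.rot90(a, 2),
    "rot270": lambda a: np.rot90(a, 3),
    "flip_lr": np.fliplr,
    "flip_ud": np.flipud,
    "transpose": lambda a: a.T,
    "anti_transpose": lambda a: np.rot90(a, 2).T,
}


def plan_symmetries(plan):
    """Return the names of transforms that leave the plan grid unchanged."""
    plan = np.asarray(plan)
    # np.array_equal is False when shapes differ, so shape-changing
    # transforms on non-square grids are rejected automatically.
    return [name for name, t in TRANSFORMS.items()
            if np.array_equal(t(plan), plan)]


def equivalent_pixels(plan, row, col):
    """List plan pixels indistinguishable from (row, col) under the plan's symmetries."""
    plan = np.asarray(plan)
    marker = np.zeros(plan.shape, dtype=bool)
    marker[row, col] = True
    hits = set()
    for name in plan_symmetries(plan):
        # Track where the single marked pixel lands after the transform.
        r, c = np.argwhere(TRANSFORMS[name](marker))[0]
        hits.add((int(r), int(c)))
    return sorted(hits)


# Two identical rooms side by side (1 = wall, 0 = room interior).
plan = np.array([
    [1, 1, 1, 1, 1],
    [1, 0, 1, 0, 1],
    [1, 1, 1, 1, 1],
])
print(plan_symmetries(plan))          # symmetries that preserve the plan
print(equivalent_pixels(plan, 1, 1))  # [(1, 1), (1, 3)]
```

In this toy plan, a photo taken inside the left room could equally well correspond to the right room. A set with more than one element signals that a single-answer correspondence target is ill-posed for that pixel, which is one way to flag cases where a predictor might need to output several hypotheses rather than one.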
References
We analyze the remaining errors and find multiple challenges that are particular to this cross-view, cross-modal problem: often, ground-level photos do not provide enough context of the overall scene, and this problem is exacerbated when symmetries in the structure make the problem ambiguous. Handling this ambiguity is an open problem deserving of future research.
— C3Po: Cross-View Cross-Modality Correspondence by Pointmap Prediction
(2511.18559 - Huang et al., 23 Nov 2025) in Section 1 (Introduction)