Zero-shot stereo matching with real-time efficiency

Develop stereo matching algorithms that achieve strong zero-shot generalization to unseen domains while maintaining real-time inference efficiency, and identify concrete architectural and training strategies that realize this combination.

Background

The literature distinguishes two families of stereo models. Accuracy-oriented models often rely on large, computationally intensive architectures (and sometimes foundation-model priors) to obtain strong zero-shot generalization. Efficiency-oriented models prioritize speed and low resource use but typically suffer a significant accuracy gap and generalize poorly. As a result, practical deployment on resource-constrained hardware frequently requires domain-specific fine-tuning, which undermines zero-shot capability.

In their review of efficient stereo methods (Section 2.2, Related Work), the authors of Lite Any Stereo state explicitly that achieving strong zero-shot performance while preserving real-time efficiency remains unresolved. This gap motivates their proposed Lite Any Stereo architecture and training strategy, which aim to bridge the divide; the general challenge of reliably attaining both properties at once is nonetheless highlighted as open.
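
As an illustration of what attaining both properties at once could mean in practice, the following is a minimal sketch of a joint check on accuracy and speed. It is not taken from the cited paper. The function names, the use of mean end-point error as the accuracy measure, and the thresholds (1.0 px error, 33 ms per frame) are all assumptions chosen for illustration, and `infer` stands for any hypothetical callable that maps a left/right image pair to a disparity map.

```python
import time
import numpy as np

def end_point_error(pred, gt, valid=None):
    """Mean absolute disparity error (pixels) over valid ground-truth pixels."""
    pred = np.asarray(pred, dtype=np.float64)
    gt = np.asarray(gt, dtype=np.float64)
    if valid is None:
        # Assumption: non-finite or non-positive ground truth marks missing data.
        valid = np.isfinite(gt) & (gt > 0)
    if not valid.any():
        raise ValueError("no valid ground-truth pixels")
    return float(np.abs(pred - gt)[valid].mean())

def median_latency_ms(infer, left, right, warmup=2, runs=10):
    """Median wall-clock time of infer(left, right) in milliseconds.

    Note: for GPU models the device must be synchronized inside `infer`,
    otherwise asynchronous execution makes the timing meaningless.
    """
    for _ in range(warmup):
        infer(left, right)
    times = []
    for _ in range(runs):
        start = time.perf_counter()
        infer(left, right)
        times.append((time.perf_counter() - start) * 1e3)
    return float(np.median(times))

def meets_both(epe, latency_ms, epe_max=1.0, latency_budget_ms=33.0):
    """True only if accuracy and speed targets hold together (illustrative thresholds)."""
    return epe <= epe_max and latency_ms <= latency_budget_ms

if __name__ == "__main__":
    # Stand-in "model": predicts a constant disparity of 10 px everywhere.
    left = right = np.zeros((4, 6))
    gt = np.full((4, 6), 10.5)
    infer = lambda l, r: np.full(l.shape, 10.0)
    epe = end_point_error(infer(left, right), gt)
    print(epe, meets_both(epe, median_latency_ms(infer, left, right)))
```

For the comparison to reflect zero-shot behavior, the evaluation images would need to come from a dataset the model was never trained or fine-tuned on, and latency would need to be measured on the target hardware.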

References

Achieving strong zero-shot ability while maintaining real-time efficiency remains an open challenge.

Lite Any Stereo: Efficient Zero-Shot Stereo Matching (2511.16555 - Jing et al., 20 Nov 2025) in Section 2.2 Efficient Stereo Matching (Related Work)