Mechanism of Epipolar Geometry Recovery in VGGT
Determine the internal mechanisms by which the Visual Geometry Grounded Transformer (VGGT) recovers epipolar geometric information in its intermediate layers, identifying how the model organizes representations to yield fundamental matrix–consistent relationships across views.
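To make "fundamental matrix-consistent" concrete: a correspondence (x, x') between two views satisfies the epipolar constraint when x'^T F x = 0, and the Sampson distance is a common first-order measure of how far a candidate pair is from satisfying it. The sketch below is only an illustration of that check, not the procedure used in the cited paper. The function names `fundamental_from_pose` and `sampson_distance` are hypothetical, and it assumes that pixel correspondences have already been extracted somehow from an intermediate layer (for example, by taking the argmax of a cross-view attention map) and that ground-truth intrinsics and relative pose are available.

```python
import numpy as np

def skew(t):
    """Cross-product matrix [t]_x such that skew(t) @ v == np.cross(t, v)."""
    return np.array([[0.0, -t[2], t[1]],
                     [t[2], 0.0, -t[0]],
                     [-t[1], t[0], 0.0]])

def fundamental_from_pose(K1, K2, R, t):
    """F for cameras x1 ~ K1 X and x2 ~ K2 (R X + t), so that x2^T F x1 = 0."""
    E = skew(t) @ R                      # essential matrix
    return np.linalg.inv(K2).T @ E @ np.linalg.inv(K1)

def sampson_distance(F, x1, x2):
    """Per-pair Sampson distance for (N, 2) pixel arrays x1 (view 1), x2 (view 2)."""
    x1h = np.hstack([x1, np.ones((len(x1), 1))])   # homogeneous coordinates
    x2h = np.hstack([x2, np.ones((len(x2), 1))])
    Fx1 = x1h @ F.T                      # epipolar lines in view 2 (rows: F x1)
    Ftx2 = x2h @ F                       # epipolar lines in view 1 (rows: F^T x2)
    num = np.sum(x2h * Fx1, axis=1) ** 2 # (x2^T F x1)^2
    den = Fx1[:, 0] ** 2 + Fx1[:, 1] ** 2 + Ftx2[:, 0] ** 2 + Ftx2[:, 1] ** 2
    return num / den
```

Under these assumptions, tracking the distribution of `sampson_distance` over correspondences taken from each layer would show at which depth the representations become epipolar-consistent. It would not by itself explain the mechanism that produces that consistency, which is the open question here.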
References
Yet, so far, we do not know how the model recovers this information.
— On Geometric Understanding and Learned Data Priors in VGGT
(2512.11508 - Bratulić et al., 12 Dec 2025) in Section 4.2 (How do attention maps encode correspondences?)