Benefit of geometry-grounding in distilled radiance fields
Determine whether incorporating geometry-grounded semantic features, such as those produced by vision backbones trained with 3D reconstruction objectives (e.g., the Visual Geometry Grounded Transformer, VGGT), provides measurable advantages over visual-only semantic features (e.g., DINO and CLIP) when distilling semantics into radiance fields (Gaussian Splatting and neural radiance fields).
References
While prior work has demonstrated the effectiveness of visual-only semantic features (e.g., DINO and CLIP) in Gaussian Splatting and neural radiance fields, the potential benefit of geometry-grounding in distilled fields remains an open question.
— Geometry Meets Vision: Revisiting Pretrained Semantics in Distilled Fields
(2510.03104 - Mei et al., 3 Oct 2025) in Abstract