Accurate 4D reconstruction from limited monocular observations
Establish an algorithmic framework that transforms limited monocular observations (single-camera videos or sparse single-view images) into an accurate model of the dynamically changing 3D world, i.e., a 4D scene. The framework must overcome the fundamental ambiguity and incompleteness inherent in single-view capture to enable reliable dynamic scene reconstruction and rendering.
References
Transforming this limited information into an accurate model of the dynamically changing 3D world remains an open research challenge, and progress in this space could enable applications in robotics, film-making, video games, and augmented reality.
— CAT4D: Create Anything in 4D with Multi-View Video Diffusion Models
(2411.18613 - Wu et al., 27 Nov 2024) in Section 1 (Introduction)