Infinite-length video generation with perfect fidelity
Determine whether, and how, the WorldWarp autoregressive pipeline can generate infinite-length video sequences with perfect fidelity. The pipeline combines the Spatio-Temporal Diffusion (ST-Diff) model with an online 3D Gaussian Splatting cache, and each generated chunk serves as historical context for subsequent chunks. The goal is to prevent artifacts and geometric inconsistencies from accumulating and propagating through that context, particularly for sequences exceeding 1000 frames.
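To make the accumulation concern concrete, here is a minimal toy sketch in Python. It is not WorldWarp's method or a measurement of it. It assumes a hypothetical scalar "error" per chunk, a hypothetical `fresh_error` introduced by each generation step, and a hypothetical `carry_over` factor describing how much of the previous chunk's error survives when that chunk is reused as context.

```python
def rollout_error(num_chunks: int, fresh_error: float, carry_over: float) -> list[float]:
    """Toy recurrence for an autoregressive rollout (illustrative assumption only).

    e[k] = carry_over * e[k-1] + fresh_error, starting from e[-1] = 0.
    Returns the error after each of the num_chunks generated chunks.
    """
    errors = []
    e = 0.0
    for _ in range(num_chunks):
        # Error inherited from the context chunk, plus error added by this step.
        e = carry_over * e + fresh_error
        errors.append(e)
    return errors


if __name__ == "__main__":
    # Hypothetical settings: 40 chunks, same fresh error per chunk.
    for carry in (0.5, 1.0, 1.1):
        final = rollout_error(40, fresh_error=0.01, carry_over=carry)[-1]
        print(f"carry_over={carry}: error after 40 chunks = {final:.4f}")
```

Under this toy model, a `carry_over` of 1 or more makes the error grow without bound as chunks are added, while a `carry_over` below 1 makes it level off near `fresh_error / (1 - carry_over)`. That plateau is still nonzero, which is one way to see why bounded drift and perfect fidelity are different targets.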
References
Although our model is trained in an asynchronous diffusion manner, where we apply varying noise strengths to different frames and spatial regions to mimic inference conditions, generating infinite-length video sequences with perfect fidelity remains an unresolved challenge.
— WorldWarp: Propagating 3D Geometry with Asynchronous Video Diffusion
(2512.19678 - Kong et al., 22 Dec 2025) in Supplementary, Section "Limitations", Error Accumulation in Long-horizon Generation