In-context temporal consistency capability of video diffusion models
Determine whether current diffusion-based video generation models exhibit in-context learning capabilities for temporal consistency tasks comparable to the established in-context generation capabilities of text-to-image diffusion models.
References
However, it remains unclear whether current video diffusion models exhibit comparable in-context capabilities for temporal consistency tasks.
— OmniTransfer: All-in-one Framework for Spatio-temporal Video Transfer
(2601.14250 - Zhang et al., 20 Jan 2026) in Section 4.2 (Task-aware Positional Bias)