Applicability of continuous-time consistency models (sCM) to large-scale text-to-image and video diffusion
Determine whether continuous-time consistency models (sCM) can be applied in practice to large-scale text-to-image and text-to-video diffusion models. Two obstacles make this unclear: infrastructure challenges in Jacobian–vector product (JVP) computation and the limitations of standard evaluation benchmarks. Identify the conditions under which sCM remains applicable at that scale.
References
Although continuous-time consistency model (sCM) is theoretically principled and empirically powerful for accelerating academic-scale diffusion, its applicability to large-scale text-to-image and video tasks remains unclear due to infrastructure challenges in Jacobian–vector product (JVP) computation and the limitations of standard evaluation benchmarks.
— Large Scale Diffusion Distillation via Score-Regularized Continuous-Time Consistency
(arXiv:2510.08431, Zheng et al., 9 Oct 2025), Abstract