Ability of diffusion models (discrete and continuous-on-discrete) to match autoregressive models

Determine whether discrete diffusion models and continuous diffusion models applied to discrete data can match the performance of standard autoregressive models on discrete sequence and image generation tasks under comparable training and evaluation protocols.

Background

Despite progress in discrete diffusion and in continuous diffusion methods applied to discrete data, standard autoregressive models remain strong baselines in many discrete-data domains. Whether diffusion-based approaches can close the performance gap to autoregressive models remains an open question.

Establishing whether diffusion-based approaches can match autoregressive performance is crucial for understanding the trade-off between the parallel sampling and distillation methods available to diffusion models and the likelihood-based training and inference of autoregressive models, which typically perform strongly.

References

Furthermore, for both model classes it remains to be seen whether they can match the performance of standard autoregressive models.

Beyond Single Tokens: Distilling Discrete Diffusion Models via Discrete MMD (2603.20155 - Hoogeboom et al., 20 Mar 2026) in Section 4 (Related work), Deterministic Diffusion Distillation