Cause of domain-dependent speed–accuracy tradeoff in LLaDA2.1
In LLaDA2.1, Speedy Mode threshold settings yield high throughput with minimal accuracy loss in structured domains such as coding and math, but degrade quality in general chat. Determine whether this domain-specific variation in decoding performance stems primarily from an inherent model preference for structured data or from distributional characteristics of the training dataset.
References
Our conjecture is that this pattern may be related to the model's inherent preference for structured data or the distributional characteristics of training dataset.
— LLaDA2.1: Speeding Up Text Diffusion via Token Editing
(2602.08676 - Bie et al., 9 Feb 2026) in Outlook and Limitation — Tradeoff Between Inference Speed and Accuracy paragraph