Exploration of quantization bit-widths in Ditto

Investigate alternative quantization bit-widths beyond the 3-bit default for the clustering-based weight quantization component within Ditto, the proposed framework for compiling Code LLMs into lightweight executables.

Background

In the experimental setup, the authors fix Ditto’s quantization to 3 bits as the default across all evaluated models to standardize comparisons. This choice balances memory savings with accuracy in their main experiments.

The authors explicitly defer other bit-width configurations to future work, so a systematic assessment of how different bit-widths behave within Ditto’s clustering and bit-packing scheme has not yet been carried out.
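
To make the open question concrete, the following is a minimal sketch of what a bit-width sweep could look like. It is not Ditto's implementation: it assumes a generic clustering quantizer (1-D k-means with 2**bits centroids over all weights of a random matrix) and measures only reconstruction error and the size of the packed cluster indices. The names `cluster_quantize`, `packed_nbytes`, and `sweep` are hypothetical, and the paper's own clustering granularity, packing layout, and accuracy metrics may differ.

```python
import numpy as np

def cluster_quantize(weights, bits, iters=20):
    """Generic 1-D k-means (Lloyd's algorithm) with 2**bits centroids.

    Assumption: a stand-in for a clustering-based quantizer, not Ditto's code.
    """
    flat = weights.ravel().astype(np.float64)
    k = 2 ** bits
    # Initialize centroids at evenly spaced quantiles of the weights.
    centroids = np.quantile(flat, (np.arange(k) + 0.5) / k)
    for _ in range(iters):
        # Assign each weight to its nearest centroid.
        idx = np.abs(flat[:, None] - centroids[None, :]).argmin(axis=1)
        # Move each non-empty centroid to the mean of its members.
        for c in range(k):
            members = flat[idx == c]
            if members.size:
                centroids[c] = members.mean()
    # Final assignment against the updated centroids.
    idx = np.abs(flat[:, None] - centroids[None, :]).argmin(axis=1)
    return idx.astype(np.uint8), centroids

def packed_nbytes(num_weights, bits):
    """Bytes needed to store one bits-wide index per weight, tightly packed.

    The (small) centroid codebook is not counted here.
    """
    return (num_weights * bits + 7) // 8

def sweep(weights, bit_widths=(2, 3, 4)):
    """Return {bits: (reconstruction MSE, packed index bytes)}."""
    results = {}
    for bits in bit_widths:
        idx, centroids = cluster_quantize(weights, bits)
        recon = centroids[idx].reshape(weights.shape)
        mse = float(np.mean((weights - recon) ** 2))
        results[bits] = (mse, packed_nbytes(weights.size, bits))
    return results

# Synthetic stand-in for one weight matrix (assumption: Gaussian weights).
rng = np.random.default_rng(0)
w = rng.normal(size=(64, 64)).astype(np.float32)
for bits, (mse, nbytes) in sweep(w).items():
    print(f"{bits}-bit: mse={mse:.5f}, packed index bytes={nbytes}")
```

A real study would replace the synthetic matrix with model weights and the reconstruction error with downstream code-generation accuracy, since the trade-off the authors fix at 3 bits is between memory footprint and task accuracy rather than raw weight error.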

References

"We leave the exploration of different bitwidths to future work."

Compiling Code LLMs into Lightweight Executables (2603.29813 - Shi et al., 31 Mar 2026) in Section 4.1, Experimental Setup (Comparisons)