Compression with Dead-zone Quantizer (CoDeQ) is a unified methodology that integrates dead-zone scalar quantization to perform both magnitude pruning and uniform quantization in a single, differentiable operator.
It enables explicit rate-distortion and sparsity control by tuning quantizer parameters such as step size, offset, and dead-zone width, thereby optimizing performance across image and model compression tasks.
Empirical results show that CoDeQ matches the rate-distortion performance of state-of-the-art codecs and achieves competitive neural network accuracies with reduced bit operations and enhanced sparsity.
Compression with Dead-zone Quantizer (CoDeQ) represents a unified methodology for signal and model compression that leverages the properties of dead-zone scalar quantization. Dead-zone quantizers introduce adjustable regions around zero where all values are mapped to zero, effecting both quantization and magnitude pruning in a single operator. This principle enables efficient image compression and supports highly sparse, low-precision neural networks in a fully differentiable, end-to-end optimization procedure. CoDeQ, as formalized in (Zhou et al., 2020) and further generalized for model compression in (Wenshøj et al., 15 Dec 2025), provides explicit control over rate-distortion tradeoffs and network sparsity by varying quantizer parameters, with provable near-optimal performance across a spectrum of applications.
1. Dead-Zone Quantizer: Formulation and Properties
The dead-zone quantizer is a variant of the uniform scalar quantizer that incorporates a widened zero bin. For a b-bit quantizer with step size Δ, the classical uniform form maps an input x as

x̄ = clip(⌊x/Δ⌉, −Q_b, Q_b),  Q_b = 2^(b−1) − 1,

while the dead-zone variant with dead-zone width d maps

q̃(x; Δ, d) = 0 for |x| ≤ d/2, and q̃(x; Δ, d) = sign(x)·(d/2 + Δ·clip(⌊(|x| − d/2)/Δ⌉, −Q_b, Q_b)) for |x| > d/2.
This scheme is equivalent to magnitude pruning with threshold τ=d/2, combined with uniform quantization above the threshold. For deep learning compression, this mapping is fully differentiable via straight-through estimators, enabling its use in joint pruning–quantization settings (Wenshøj et al., 15 Dec 2025).
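The operator above can be sketched in a few lines of NumPy (the function name and vectorized style are illustrative, not taken from the papers):

```python
import numpy as np

def deadzone_quantize(x, step, dead, bits=8):
    """Dead-zone quantizer: values with |x| <= dead/2 map to 0;
    the remaining magnitudes are uniformly quantized with step size `step`."""
    qb = 2 ** (bits - 1) - 1                            # symmetric clipping level Q_b
    mag = np.abs(x)
    idx = np.clip(np.rint((mag - dead / 2) / step), -qb, qb)
    y = np.sign(x) * (dead / 2 + step * idx)
    return np.where(mag <= dead / 2, 0.0, y)

x = np.array([-1.2, -0.05, 0.03, 0.4, 2.0])
print(deadzone_quantize(x, step=0.25, dead=0.2))
```

Setting `dead=0` recovers (up to the half-step offset) plain uniform quantization, while growing `dead` prunes an ever larger fraction of small-magnitude inputs to exactly zero.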
The dead-zone quantizer for image latent coefficients y is similarly parameterized by step size Q and offset δ:

k = sign(y)·⌊|y|/Q + δ⌋,  ŷ = k·Q,

where δ tunes the symmetry and width of the dead zone: δ = 1/2 recovers the uniform rounding quantizer, and smaller δ widens the zero bin. In practice, a single tuned δ provides optimal rate-distortion performance for learned codecs (Zhou et al., 2020).
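A minimal sketch of this offset form, using the classical floor-plus-offset dead-zone convention (an assumption for illustration, not the codec's actual implementation):

```python
import numpy as np

def deadzone_encode(y, Q, delta):
    # delta = 0.5 recovers uniform rounding; smaller delta widens the zero bin
    return (np.sign(y) * np.floor(np.abs(y) / Q + delta)).astype(int)

def deadzone_decode(k, Q):
    return k * Q  # reconstruct at the bin centers k*Q

y = np.array([-0.9, -0.2, 0.1, 0.6])
k = deadzone_encode(y, Q=0.5, delta=0.25)
print(k, deadzone_decode(k, Q=0.5))
```

With Q = 0.5 and δ = 0.25, every coefficient with |y| < Q(1 − δ) = 0.375 is mapped to the zero bin, so the two middle values above are pruned while the outer two survive quantization.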
2. Training and Learned Parameterization

Training is performed using the RaDOGAGA (Rate-Distortion Optimization Guided Autoencoder for Isometric Embedding in Euclidean Latent Space) framework, which augments the rate-distortion objective with distortion-driven orthogonality regularization in latent space, using metric-specific hyperparameter settings for MSE and MS-SSIM. The induced latent space becomes isometric to the selected fidelity metric, ensuring rate-distortion optimality of the subsequent quantization (Zhou et al., 2020).
In the context of model compression, the dead-zone widths d_ℓ and optional bit-widths b_ℓ of each layer ℓ are parameterized by unconstrained scalars and learned end-to-end. The step size Δ_ℓ is derived from absmax weight statistics with Q_b = 2^(b−1) − 1 as above, and the task loss is typically the cross-entropy evaluated with the quantized weights (Wenshøj et al., 15 Dec 2025).
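One way to realize such an unconstrained parameterization is to pass free scalars through a softplus so that widths and bit-widths stay positive. This is a hypothetical sketch of the idea, not the exact mapping used by Wenshøj et al.; the minimum-bit floor is likewise an assumption:

```python
import numpy as np

def softplus(t):
    return np.log1p(np.exp(t))

def layer_quant_params(theta_d, theta_b, w):
    """Map unconstrained scalars to positive quantizer parameters.
    The step size uses absmax statistics of the layer weights, as in the text."""
    bits = int(round(2 + softplus(theta_b)))   # keep at least 2 bits (assumption)
    qb = 2 ** (bits - 1) - 1                   # Q_b = 2^(b-1) - 1
    step = np.max(np.abs(w)) / qb              # absmax scale
    dead = softplus(theta_d) * step            # dead-zone width in step units
    return step, dead, bits

w = np.linspace(-1.0, 1.0, 11)
step, dead, bits = layer_quant_params(theta_d=0.0, theta_b=2.0, w=w)
print(step, dead, bits)
```

Because `theta_d` and `theta_b` live on the whole real line, standard gradient descent can move them freely while the derived quantizer parameters remain valid by construction.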
3. Rate-Distortion Optimization and Sparsity Control
CoDeQ enables explicit, tunable rate-distortion control by modulating the quantizer step size Q and offset δ (for image compression) or the dead-zone widths d_ℓ (for model compression). In the classical image codec setting, rate R and distortion D are jointly optimized via the Lagrangian R + λD, tracing the rate-distortion curve through the selection of operating points.
Under RaDOGAGA’s loss, the latent space becomes orthonormal and metric-isometric, enabling the dead-zone quantizer to function equivalently to a learned KLT/DCT transform followed by scalar quantization. Experimentally, sweeping the step size Q of a single trained model traces a wide range of bitrates, matching the RD performance of individually tuned models (Zhou et al., 2020).
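This sweep behavior is easy to reproduce on synthetic Gaussian latents (a toy illustration: empirical entropy of the quantization indices stands in for the codec's actual entropy model):

```python
import numpy as np

rng = np.random.default_rng(0)
y = rng.normal(size=10000)  # stand-in latent coefficients

def rd_point(y, Q, delta=0.25):
    k = np.sign(y) * np.floor(np.abs(y) / Q + delta)   # dead-zone quantization
    y_hat = k * Q
    _, counts = np.unique(k, return_counts=True)
    p = counts / counts.sum()
    rate = -(p * np.log2(p)).sum()                     # empirical entropy, bits/coeff
    dist = np.mean((y - y_hat) ** 2)                   # MSE distortion
    return rate, dist

for Q in (0.25, 0.5, 1.0, 2.0):
    r, d = rd_point(y, Q)
    print(f"Q={Q:4}  rate={r:.2f} bpc  mse={d:.4f}")
```

Growing Q monotonically lowers the rate and raises the distortion, tracing the RD curve from a single model exactly as described above.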
For model compression, varying d_ℓ at runtime directly adjusts layer sparsity, decoupled from the quantization bit-width. A regularization penalty on the dead-zone widths controls global sparsity, and a companion penalty on the bit-widths encourages low precision. The optimization is differentiable and performed end-to-end; all pruning and quantization decisions are learned jointly (Wenshøj et al., 15 Dec 2025).
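The sparsity lever is simple to see numerically: the pruned fraction depends only on the dead-zone width relative to the weight distribution (a toy illustration with Gaussian stand-in weights):

```python
import numpy as np

rng = np.random.default_rng(1)
w = rng.normal(scale=0.05, size=100000)    # stand-in layer weights

def sparsity(w, dead):
    # fraction of weights falling inside the dead zone, i.e. pruned to zero
    return np.mean(np.abs(w) <= dead / 2)

for dead in (0.02, 0.05, 0.1):
    print(f"d={dead}: sparsity={sparsity(w, dead):.2%}")
```

Widening `dead` monotonically increases the pruned fraction without touching the bit-width, which is exactly the decoupling the text describes.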
4. Algorithmic Pipeline and Inference Procedure
Image compression with CoDeQ is realized by the following inference steps:
Input: image x,
Analysis transform: y = f_a(x),
Dead-zone quantization: k = sign(y)·⌊|y|/Q + δ⌋,
Entropy coding: bits via arithmetic encoding of k under the learned latent prior,
Dequantization: ŷ = k·Q,
Synthesis transform: x̂ = f_s(ŷ).
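The steps above can be sketched end-to-end with placeholder transforms (the orthonormal matrix standing in for f_a and f_s is arbitrary, not the learned networks, and entropy coding is omitted):

```python
import numpy as np

rng = np.random.default_rng(2)
A, _ = np.linalg.qr(rng.normal(size=(8, 8)))   # toy orthonormal analysis transform

def f_a(x):             # stand-in analysis transform
    return A @ x

def f_s(y_hat):         # stand-in synthesis transform (inverse of f_a)
    return A.T @ y_hat

def roundtrip(x, Q=0.1, delta=0.25):
    y = f_a(x)
    k = np.sign(y) * np.floor(np.abs(y) / Q + delta)   # dead-zone quantization
    # arithmetic coding of the integer indices k would happen here (omitted)
    y_hat = k * Q                                      # dequantization
    return f_s(y_hat)

x = rng.normal(size=8)
x_hat = roundtrip(x)
print(np.max(np.abs(x - x_hat)))   # small reconstruction error, bounded by Q
```

Because the stand-in transform is orthonormal, the reconstruction error is bounded by the quantization error in the latent domain, mirroring the isometry argument in the RaDOGAGA discussion.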
Model compression follows a fully differentiable QAT loop:
Initialize weights and parameters,
For each layer ℓ, compute the step size Δ_ℓ, dead-zone width d_ℓ, bit-width b_ℓ, and clipping level Q_b,
Quantize/prune: w̃ = q̃(w; Δ_ℓ, d_ℓ),
Compute the task loss and the sparsity/bit-width regularizers,
Backpropagate using STE for the non-differentiable components and update the weights and quantizer parameters.
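One STE training step can be sketched in plain NumPy (a toy linear regression, not the actual QAT recipe of the paper; the forward pass uses quantized weights while the gradient passes straight through to the latent full-precision weights):

```python
import numpy as np

def dz_quant(w, step, dead, qb=127):
    # dead-zone quantizer from Section 1
    mag = np.abs(w)
    idx = np.clip(np.rint((mag - dead / 2) / step), -qb, qb)
    return np.where(mag <= dead / 2, 0.0, np.sign(w) * (dead / 2 + step * idx))

rng = np.random.default_rng(3)
X = rng.normal(size=(256, 4))
w_true = np.array([1.0, -0.5, 0.0, 0.25])   # target weights (representable on the grid)
y = X @ w_true

w = np.zeros(4)                              # latent full-precision weights
lr = 0.1
for _ in range(200):
    w_q = dz_quant(w, step=0.05, dead=0.1)   # forward: quantized/pruned weights
    err = X @ w_q - y
    grad_wq = X.T @ err / len(X)             # gradient w.r.t. quantized weights
    w -= lr * grad_wq                        # STE: apply it to the latent weights
print(dz_quant(w, step=0.05, dead=0.1))
```

The latent weights cross the dead zone freely because the STE gradient ignores the quantizer's zero derivative, so the quantized weights converge to the representable target.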
5. Empirical Results

For learned image compression:

PSNR: CoDeQ deviates only marginally from the multi-model baseline,
MS-SSIM: the deviation from the baseline is similarly small,
Best results are obtained with a tuned dead-zone offset δ (Zhou et al., 2020).
Visual inspection indicates superior retention of fine textures at low bitrates compared to non–CoDeQ baselines, with minor edge artifacts attributed to scalar uniform quantization. At high bitrate, reconstructions are nearly identical.
For ImageNet/ResNet-18 model compression:
CoDeQ attains Top-1 accuracy close to the full-precision baseline at a fraction of the bit operations (BOPs), with substantially increased weight sparsity,
Layer-wise analysis shows CoDeQ assigns higher bit-width/lower sparsity to first/last layers and lower bit-width/higher sparsity to middle layers, matching canonical heuristic allocations but discovered automatically in the end-to-end process (Wenshøj et al., 15 Dec 2025).
6. Discussion: Implementation, Limitations, and Extensions
CoDeQ offers several operational advantages:
Eliminates the need for auxiliary search phases or parameter selection procedures outside the training loop,
Pruning and quantization are unified, architecture-agnostic, and controlled by global regularization,
Limitations include the use of layer-wise granularity for quantization and pruning; finer granularity (channel-, group-, or block-wise) and structured sparsity remain open for future development. The current scale factors utilize absmax statistics; direct learning of scale could further decrease quantization error.
Potential extensions of the CoDeQ framework are:
Per-channel/group dead-zone control for sparsity granularity,
Joint scale learning with regularization,
Structured dead-zone quantization for hardware exploitation,
Integration with post-training coding and compression techniques (e.g., Huffman coding, distillation) for maximal compression.
CoDeQ bridges magnitude pruning and uniform quantization, leveraging dead-zone properties for efficient compression in both data and model domains, with strong empirical performance and minimal operational complexity (Zhou et al., 2020, Wenshøj et al., 15 Dec 2025).