Compression with Dead-Zone Quantizer (CoDeQ)
- Compression with Dead-zone Quantizer (CoDeQ) is a unified methodology that integrates dead-zone scalar quantization to perform both magnitude pruning and uniform quantization in a single, differentiable operator.
- It enables explicit rate-distortion and sparsity control by tuning quantizer parameters such as step size, offset, and dead-zone width, thereby optimizing performance across image and model compression tasks.
- Empirical results show that CoDeQ matches the rate-distortion performance of state-of-the-art codecs and achieves competitive neural network accuracies with reduced bit operations and enhanced sparsity.
Compression with Dead-zone Quantizer (CoDeQ) represents a unified methodology for signal and model compression that leverages the properties of dead-zone scalar quantization. Dead-zone quantizers introduce adjustable regions around zero where all values are mapped to zero, effecting both quantization and magnitude pruning in a single operator. This principle enables efficient image compression and supports highly sparse, low-precision neural networks in a fully differentiable, end-to-end optimization procedure. CoDeQ, as formalized in (Zhou et al., 2020) and further generalized for model compression in (Wenshøj et al., 15 Dec 2025), provides explicit control over rate-distortion tradeoffs and network sparsity by varying quantizer parameters, with provable near-optimal performance across a spectrum of applications.
1. Dead-Zone Quantizer: Formulation and Properties
The dead-zone quantizer is a variant of the uniform scalar quantizer that incorporates a widened zero bin. For a $b$-bit quantizer with step size $\Delta$, the classical form maps an input $x$ as
$$Q(x) = \operatorname{sign}(x)\,\Delta\left\lfloor \frac{|x|}{\Delta} \right\rfloor,$$
assigning values within $(-\Delta, \Delta)$ to zero, thereby pruning small coefficients.
CoDeQ generalizes this with an explicit dead-zone width $d$:
$$Q_{d,\Delta}(x) = \begin{cases} 0, & |x| \le d, \\ \operatorname{sign}(x)\,\Delta\left\lfloor \dfrac{|x|}{\Delta} + \dfrac{1}{2} \right\rfloor, & |x| > d. \end{cases}$$
This scheme is equivalent to magnitude pruning with threshold $d$, combined with uniform quantization above the threshold. For deep learning compression, this mapping is fully differentiable via straight-through estimators, enabling its use in joint pruning–quantization settings (Wenshøj et al., 15 Dec 2025).
The dead-zone quantizer for image latent coefficients is similarly parameterized by step size $\Delta$ and offset $\delta$, producing integer indices
$$q = Q_{\Delta,\delta}(y) = \operatorname{sign}(y)\,\max\!\left(0,\,\left\lfloor \frac{|y|}{\Delta} + \delta \right\rfloor\right),$$
where $\delta \in [0, 1]$ tunes the symmetry and width of the dead zone. In practice, an appropriately tuned $\delta$ yields optimal rate-distortion performance for learned codecs (Zhou et al., 2020).
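To make the operator concrete, the following is a minimal PyTorch sketch of the weight-space dead-zone quantizer. The sigmoid surrogate for the prune mask, the smoothing temperature `tau`, and the exact straight-through placement are illustrative assumptions; the papers' estimators may differ.

```python
import torch

def dead_zone_quantize(x, step, dead_zone, tau=0.05):
    """Dead-zone scalar quantizer: prune |x| <= dead_zone to zero, uniformly
    quantize the remainder with the given step (round-to-nearest).
    Rounding and the hard prune mask are bridged with straight-through
    estimators (STE) so gradients also reach `step` and `dead_zone`."""
    u = x.abs() / step
    u_round = u + (torch.floor(u + 0.5) - u).detach()   # STE round-to-nearest
    q = torch.sign(x) * step * u_round                  # uniform quantization
    soft = torch.sigmoid((x.abs() - dead_zone) / tau)   # soft surrogate mask
    mask = soft + ((x.abs() > dead_zone).float() - soft).detach()  # hard fwd, soft bwd
    return q * mask

# Forward pass prunes small values and snaps the rest to the grid:
w = torch.tensor([-0.72, -0.10, 0.03, 0.24, 0.95], requires_grad=True)
print(dead_zone_quantize(w, step=0.25, dead_zone=0.15))
# -> tensor([-0.75, 0.00, 0.00, 0.25, 1.00], grad_fn=...)
```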
2. Network Architectures and Training Protocols
CoDeQ-based image compression methods employ autoencoder architectures in the tradition of Ballé et al. (2017). The encoder consists of three strided convolutions with GDN nonlinearities, mapping image patches to latent representations. The decoder features three transposed convolutions with inverse GDN. Bottleneck dimensionality is fixed at 128 coefficients per patch; models are trained for 1 million steps with batch size 8 and a fixed learning rate (Zhou et al., 2020).
Training is performed using the RaDOGAGA (Rate-Distortion Optimization Guided Autoencoder for Isometric Embedding in Euclidean Latent Space) framework. RaDOGAGA adds distortion-driven orthogonality regularization in latent space:
- Primary distortion: $D_1 = d(x, \hat{x})$ (MSE or MS-SSIM),
- Jacobian orthogonality: $D_2 = d(\hat{x}, \hat{x}_\epsilon)$, with $\hat{x}_\epsilon = g(z + \epsilon)$ and $\epsilon$ a small random perturbation of the latent $z$,
- Log-rate penalty $-\log P(z)$ via a learned CDF prior $P$.
The total loss is
$$\mathcal{L} = -\log P(z) + \lambda_1 D_1 + \lambda_2 D_2,$$
with metric-specific weight settings ($\lambda_1$, $\lambda_2$) for MSE and MS-SSIM. The induced latent space becomes isometric to the selected fidelity metric, ensuring rate-distortion optimality under subsequent quantization (Zhou et al., 2020).
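The loss can be assembled as in the schematic sketch below; `encoder`, `decoder`, and `log_prob` are placeholders, and the Gaussian latent perturbation and MSE distortion are assumptions standing in for the exact formulation in (Zhou et al., 2020).

```python
import torch

def radogaga_loss(x, encoder, decoder, log_prob, lam1, lam2, noise=0.01):
    """Schematic RaDOGAGA-style objective: rate term, primary distortion,
    and an isometry (Jacobian-orthogonality) regularizer that compares the
    reconstruction against a decode of a perturbed latent."""
    z = encoder(x)
    x_hat = decoder(z)
    x_pert = decoder(z + noise * torch.randn_like(z))  # perturbed decode
    rate = -log_prob(z).mean()                         # -log P(z), learned prior
    d1 = torch.mean((x - x_hat) ** 2)                  # primary distortion (MSE)
    d2 = torch.mean((x_hat - x_pert) ** 2)             # isometry regularizer
    return rate + lam1 * d1 + lam2 * d2

# Toy usage with identity transforms and a standard-normal prior:
normal = torch.distributions.Normal(0.0, 1.0)
loss = radogaga_loss(torch.randn(8, 128), torch.nn.Identity(),
                     torch.nn.Identity(), normal.log_prob, lam1=1.0, lam2=0.1)
```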
In the context of model compression, the dead-zone widths $d_\ell$ and optional bit-widths $b_\ell$ are parameterized by unconstrained scalars and learned end-to-end: positivity- and range-enforcing transforms map the scalars to $d_\ell$ and $b_\ell$, and the per-layer quantization step $\Delta_\ell$ is derived from the layer's scale statistic and bit-width. The optimization objective takes the form
$$\mathcal{L} = \mathcal{L}_{\text{task}} + \lambda_d R_{\text{sparsity}} + \lambda_b R_{\text{bits}},$$
with $\mathcal{L}_{\text{task}}$ typically the cross-entropy computed over quantized weights (Wenshøj et al., 15 Dec 2025).
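One plausible realization of this parameterization is sketched below. The softplus/sigmoid mappings, the maximum bit-width `b_max`, and the step formula tied to the absmax scale (noted in Section 6) are assumptions for illustration, not the paper's exact formulas.

```python
import torch
import torch.nn.functional as F

def layer_quant_params(theta_d, theta_b, w, b_max=8):
    """Map unconstrained scalars to a positive dead-zone width d, an integer
    bit-width b (via an STE round), and a quantization step derived from the
    layer's absmax scale. All mappings here are illustrative choices."""
    scale = w.abs().max().detach()                        # absmax scale statistic
    d = F.softplus(theta_d)                               # dead-zone width > 0
    b_cont = 2.0 + torch.sigmoid(theta_b) * (b_max - 2.0)
    b = b_cont + (torch.round(b_cont) - b_cont).detach()  # STE round to integer bits
    step = (scale - d).clamp(min=1e-8) / (2 ** (b - 1) - 1)
    return d, b, step
```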
3. Rate-Distortion Optimization and Sparsity Control
CoDeQ enables explicit, tunable rate-distortion control by modulating the quantizer step size $\Delta$ and offset $\delta$ (for image compression) or the dead-zone widths $d_\ell$ (for model compression). In the classical image codec setting, rate and distortion are jointly optimized with a Lagrange multiplier $\lambda$, tracing the rate-distortion curve via selection of operating points.
Under RaDOGAGA’s loss, the latent space becomes orthonormal and metric-isometric, enabling the dead-zone quantizer to function equivalently to a learned KLT/DCT transform followed by scalar quantization. Experimentally, sweeping the step size $\Delta$ traverses a wide range of bitrates with a single model, matching the RD performance of individually tuned models (Zhou et al., 2020).
For model compression, varying $d_\ell$ at runtime directly adjusts layer sparsity, decoupled from the quantization bit-width. Regularization via $\lambda_d$ controls global sparsity, and $\lambda_b$ encourages low precision. The optimization is differentiable and performed end-to-end; all pruning and quantization decisions are learned jointly (Wenshøj et al., 15 Dec 2025).
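Because the dead zone acts as a magnitude-pruning threshold, sparsity can be read off directly from $d$. The toy sweep below illustrates this runtime control on synthetic weights (values are illustrative, not the paper's):

```python
import torch

torch.manual_seed(0)
w = torch.randn(100_000)            # stand-in for a trained layer's weights
for d in (0.0, 0.5, 1.0, 1.5):
    sparsity = (w.abs() <= d).float().mean().item()
    print(f"dead-zone width {d:.1f} -> sparsity {sparsity:5.1%}")
# Larger d prunes more weights; the bit-width of the survivors is unchanged.
```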
4. Algorithmic Pipeline and Inference Procedure
Image compression with CoDeQ is realized by the following inference steps:
- Input: image $x$,
- Analysis transform: $z = f(x)$,
- Dead-zone quantization: $q = Q_{\Delta,\delta}(z)$,
- Entropy coding: bits via arithmetic encoding with probabilities from the learned CDF prior $P$,
- Dequantization: $\hat{z} = q\,\Delta$,
- Synthesis transform: $\hat{x} = g(\hat{z})$.
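A schematic version of this pipeline follows; `analysis`/`synthesis` are placeholders for the learned transforms $f$ and $g$, arithmetic coding is elided, and reconstruction at $q\,\Delta$ is one simple dequantization rule (a codec may instead use bin centroids).

```python
import torch

def dz_encode(z, step, offset):
    """Dead-zone quantization to integer indices:
    q = sign(z) * max(0, floor(|z|/step + offset))."""
    return torch.sign(z) * torch.floor(z.abs() / step + offset).clamp(min=0.0)

def dz_decode(q, step):
    """Dequantize on the uniform grid."""
    return q * step

analysis = synthesis = torch.nn.Identity()   # placeholders for f and g
x = torch.randn(1, 128)                      # stand-in input
q = dz_encode(analysis(x), step=0.2, offset=0.5)
# ... arithmetic-encode q under the learned CDF prior, transmit, decode ...
x_hat = synthesis(dz_decode(q, step=0.2))
```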
Model compression follows a fully differentiable QAT loop:
- Initialize weights $W_\ell$ and the quantizer parameters,
- For each layer $\ell$, compute $d_\ell$, $b_\ell$, the scale $s_\ell$, and the step $\Delta_\ell$,
- Quantize/prune: $\hat{W}_\ell = Q_{d_\ell,\Delta_\ell}(W_\ell)$,
- Compute task loss and regularizers,
- Backpropagate using STE for non-differentiable components,
- Update parameters via gradient descent (Wenshøj et al., 15 Dec 2025).
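Putting the pieces together, a toy end-to-end step for a single linear layer, reusing `dead_zone_quantize` and `layer_quant_params` from the sketches above. The soft-count sparsity regularizer and the loss weights are assumptions, not the paper's.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
w = torch.randn(10, 784, requires_grad=True)   # one linear layer's weights
theta_d = torch.zeros(1, requires_grad=True)   # unconstrained dead-zone parameter
theta_b = torch.zeros(1, requires_grad=True)   # unconstrained bit-width parameter
opt = torch.optim.SGD([w, theta_d, theta_b], lr=1e-2)

x, y = torch.randn(32, 784), torch.randint(0, 10, (32,))
for _ in range(100):
    opt.zero_grad()
    d, b, step = layer_quant_params(theta_d, theta_b, w)
    w_q = dead_zone_quantize(w, step, d)       # prune + quantize, STE inside
    task = F.cross_entropy(x @ w_q.t(), y)     # task loss on quantized weights
    nonzero = torch.sigmoid((w.abs() - d) / 0.1).mean()  # soft surviving fraction
    loss = task + 1e-2 * nonzero + 1e-3 * b.mean()       # sparsity + bit penalties
    loss.backward()
    opt.step()
```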
5. Empirical Results and Rate-Sparsity Benchmarks
On Kodak images, CoDeQ’s variable-rate method matches the rate-distortion envelope of independently trained Ballé et al. (2017) models for both PSNR and MS-SSIM criteria:
- PSNR: CoDeQ's deviation from the multi-model baseline is negligible,
- MS-SSIM: the deviation is likewise within a small dB margin,
- Best results are obtained with an appropriately tuned dead-zone offset $\delta$ (Zhou et al., 2020).
Visual inspection indicates superior retention of fine textures at low bitrates compared to non-CoDeQ baselines, with minor edge artifacts attributed to scalar uniform quantization. At high bitrates, reconstructions are nearly identical.
For ImageNet/ResNet-18 model compression, CoDeQ is benchmarked against the full-precision baseline and joint pruning–quantization methods, including SQL (joint ADMM) and QST-B, on Top-1 accuracy and bit operations (BOPs). Both CoDeQ variants, fixed 4-bit with learned sparsity and mixed precision with learned sparsity, attain Top-1 accuracy competitive with these baselines at substantially reduced BOPs (Wenshøj et al., 15 Dec 2025).
Layer-wise analysis shows CoDeQ assigns higher bit-width/lower sparsity to first/last layers and lower bit-width/higher sparsity to middle layers, matching canonical heuristic allocations but discovered automatically in the end-to-end process (Wenshøj et al., 15 Dec 2025).
6. Discussion: Implementation, Limitations, and Extensions
CoDeQ offers several operational advantages:
- Eliminates the need for auxiliary search phases or parameter selection procedures outside the training loop,
- Pruning and quantization are unified, architecture-agnostic, and controlled by global regularization,
- Fixed 4-bit variant matches mixed-precision performance, favoring hardware compatibility.
Limitations include the use of layer-wise granularity for quantization and pruning; finer granularity (channel-, group-, or block-wise) and structured sparsity remain open for future development. The current scale factors utilize absmax statistics; direct learning of scale could further decrease quantization error.
Potential extensions of the CoDeQ framework are:
- Per-channel/group dead-zone control for sparsity granularity,
- Joint scale learning with regularization,
- Structured dead-zone quantization for hardware exploitation,
- Integration with post-training coding (e.g., Huffman, distillation) for maximal compression.
CoDeQ bridges magnitude pruning and uniform quantization, leveraging dead-zone properties for efficient compression in both data and model domains, with strong empirical performance and minimal operational complexity (Zhou et al., 2020, Wenshøj et al., 15 Dec 2025).