
Compression with Dead-Zone Quantizer (CoDeQ)

Updated 22 December 2025
  • Compression with Dead-zone Quantizer (CoDeQ) is a unified methodology that integrates dead-zone scalar quantization to perform both magnitude pruning and uniform quantization in a single, differentiable operator.
  • It enables explicit rate-distortion and sparsity control by tuning quantizer parameters such as step size, offset, and dead-zone width, thereby optimizing performance across image and model compression tasks.
  • Empirical results show that CoDeQ matches the rate-distortion performance of state-of-the-art codecs and achieves competitive neural network accuracies with reduced bit operations and enhanced sparsity.

Compression with Dead-zone Quantizer (CoDeQ) represents a unified methodology for signal and model compression that leverages the properties of dead-zone scalar quantization. Dead-zone quantizers introduce adjustable regions around zero where all values are mapped to zero, effecting both quantization and magnitude pruning in a single operator. This principle enables efficient image compression and supports highly sparse, low-precision neural networks in a fully differentiable, end-to-end optimization procedure. CoDeQ, as formalized in (Zhou et al., 2020) and further generalized for model compression in (Wenshøj et al., 15 Dec 2025), provides explicit control over rate-distortion tradeoffs and network sparsity by varying quantizer parameters, with provable near-optimal performance across a spectrum of applications.

1. Dead-Zone Quantizer: Formulation and Properties

The dead-zone quantizer is a variant of the uniform scalar quantizer that incorporates a widened zero bin. For a $b$-bit quantizer with step size $\Delta$, the classical form maps an input $x$ as

$$\bar{x} = \mathrm{clip}\bigl(\lfloor x/\Delta \rceil,\, -Q_b,\, Q_b\bigr), \qquad Q_b = 2^{b-1} - 1,$$

assigning values with $|x| \leq \Delta/2$ to zero, thereby pruning small coefficients.

CoDeQ generalizes this with an explicit dead-zone width $d \geq \Delta$:

$$\tilde{q}(x; \Delta, d) = \begin{cases} 0, & |x| \leq d/2, \\ \mathrm{sign}(x)\left(\dfrac{d}{2} + \Delta\, \mathrm{clip}\left(\left\lfloor \dfrac{|x| - d/2}{\Delta} \right\rceil, -Q_b, Q_b\right)\right), & |x| > d/2. \end{cases}$$

This scheme is equivalent to magnitude pruning with threshold $\tau = d/2$, combined with uniform quantization above the threshold. For deep learning compression, the mapping is made differentiable via straight-through estimators, enabling its use in joint pruning–quantization settings (Wenshøj et al., 15 Dec 2025).
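As an illustration, the following PyTorch sketch implements $\tilde{q}$ with a straight-through estimator applied at the rounding step; the function names are illustrative, and the exact STE placement in the paper may differ:

```python
import torch

def ste_round(x: torch.Tensor) -> torch.Tensor:
    # Straight-through estimator: round in the forward pass, identity gradient.
    return x + (torch.round(x) - x).detach()

def dead_zone_quantize(w, delta, d, b):
    """Generalized dead-zone quantizer q~(w; Delta, d).

    `delta`, `d`, and `b` may be Python scalars or scalar tensors. Values with
    |w| <= d/2 are pruned to zero; the remainder are uniformly quantized with
    step size `delta`, clipped to the b-bit level range.
    """
    qb = 2 ** (b - 1) - 1
    mask = (w.abs() > d / 2).to(w.dtype)  # 0 inside the dead zone, 1 outside
    levels = torch.clamp(ste_round((w.abs() - d / 2) / delta), -qb, qb)
    return torch.sign(w) * (d / 2 + delta * levels) * mask
```

With $d = \Delta$ this reduces, up to the reconstruction offset, to the classical dead-zone quantizer above.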

The dead-zone quantizer for image latent coefficients is similarly parameterized by step size $Q$ and offset $\delta$:

$$q = \mathrm{sign}(y)\left\lfloor \frac{|y|}{Q} + \delta \right\rfloor, \qquad \hat{y} = qQ,$$

where $\delta \in [0, 0.5]$ tunes the symmetry and width of the dead zone. In practice, $\delta = 0.45$ provides optimal rate-distortion performance for learned codecs (Zhou et al., 2020).
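For concreteness, a minimal sketch of this latent quantizer and its dequantization (variable names are illustrative):

```python
import torch

def latent_quantize(y: torch.Tensor, Q: float, delta: float = 0.45):
    """Dead-zone quantization of latents: q = sign(y) * floor(|y|/Q + delta)."""
    q = torch.sign(y) * torch.floor(y.abs() / Q + delta)
    y_hat = q * Q  # dequantized latent fed to the decoder
    return q, y_hat
```

Smaller $\delta$ widens the zero bin (more coefficients pruned to zero), while larger $Q$ coarsens the step size; both reduce rate at the cost of distortion.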

2. Network Architectures and Training Protocols

CoDeQ-based image compression methods employ autoencoder architectures in the tradition of Ballé et al. (2017). The encoder $f_\theta$ consists of three strided convolutions with GDN nonlinearities, mapping $256 \times 256 \times 3$ patches to $1 \times 1 \times 128$ latent representations. The decoder $g_\phi$ features three transposed convolutions with inverse GDN. Bottleneck dimensionality is fixed at 128 coefficients per patch; models are trained over 1 million steps with batch size 8 and learning rate $10^{-4}$ (Zhou et al., 2020).

Training is performed using the RaDOGAGA (Rate-Distortion Optimization Guided Autoencoder for Isometric Embedding in Euclidean Latent Space) framework. RaDOGAGA adds distortion-driven orthogonality regularization in latent space:

  • Primary distortion: $D_1 = D(x, \hat{x})$ (MSE or MS-SSIM),
  • Jacobian orthogonality: $D_2 = D(\hat{x}, \tilde{x})$ with $\tilde{x} = g_\phi(y + \epsilon)$ and $\epsilon \sim \mathrm{Uniform}(-\alpha/2, \alpha/2)$,
  • Log-rate penalty via a learned CDF prior $P_\psi(y) = \mathrm{CDF}_\psi(y + \alpha/2) - \mathrm{CDF}_\psi(y - \alpha/2)$.

The total loss is

$$L(\theta, \phi, \psi) = -\log_2 P_\psi(y) + \lambda_1 \log D_1 + \lambda_2 D_2,$$

with metric-specific ($\lambda_1$, $\lambda_2$, $\alpha$) settings for MSE and MS-SSIM. The induced latent space becomes isometric to the selected fidelity metric, ensuring rate-distortion optimality under subsequent quantization (Zhou et al., 2020).
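A schematic of this loss computation, assuming an `encoder`/`decoder` pair, a distortion callable `dist`, and a learned-prior CDF `cdf_psi` (all placeholders; the actual RaDOGAGA implementation may differ in detail):

```python
import torch

def radogaga_loss(x, encoder, decoder, cdf_psi, dist, lam1, lam2, alpha):
    """Sketch of the RaDOGAGA objective: rate + lam1 * log(D1) + lam2 * D2."""
    y = encoder(x)
    x_hat = decoder(y)
    # Perturbed reconstruction for the latent-orthogonality term.
    eps = (torch.rand_like(y) - 0.5) * alpha  # Uniform(-alpha/2, alpha/2)
    x_tilde = decoder(y + eps)

    d1 = dist(x, x_hat)        # primary distortion (e.g. MSE)
    d2 = dist(x_hat, x_tilde)  # Jacobian-orthogonality term
    # Discrete likelihood of each coefficient under the learned CDF prior.
    p_y = cdf_psi(y + alpha / 2) - cdf_psi(y - alpha / 2)
    rate = -torch.log2(p_y.clamp_min(1e-9)).sum()
    return rate + lam1 * torch.log(d1) + lam2 * d2
```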

In the context of model compression, the dead-zone widths $d_\ell$ and optional bit-widths $b_\ell$ are parameterized and learned end-to-end via unconstrained scalars $\theta_{\mathrm{dz},\ell}$ and $\theta_{\mathrm{bit},\ell}$, using the formulas

$$d_\ell(\theta_{\mathrm{dz},\ell}) = 2 R_\ell \left[1 - \tanh|\theta_{\mathrm{dz},\ell}|\right], \qquad R_\ell = \max_i |w_{\ell,i}|,$$

$$b_\ell(\theta_{\mathrm{bit},\ell}) = \left\lfloor \tanh|\theta_{\mathrm{bit},\ell}|\,(b_{\max} - b_{\min}) + b_{\min} \right\rceil,$$

and quantization step

$$\Delta_\ell = \frac{2 R_\ell - d_\ell}{2^{b_\ell - 1} \cdot 2 - 1}.$$

The optimization objective is

$$\mathcal{L}_{\mathrm{CoDeQ}} = \mathcal{L}_{\mathrm{task}}\bigl(\{\tilde{q}(w_\ell; \Delta_\ell, d_\ell)\}_\ell\bigr) + \lambda_{\mathrm{dz}} \sum_\ell \|\theta_{\mathrm{dz},\ell}\|_2^2 + \lambda_{\mathrm{bit}} \sum_\ell \|\theta_{\mathrm{bit},\ell}\|_2^2,$$

with $\mathcal{L}_{\mathrm{task}}$ typically the cross-entropy over quantized weights (Wenshøj et al., 15 Dec 2025).
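The parameter mapping is straightforward to express in code. Below is a hedged sketch deriving $d_\ell$, $b_\ell$, and $\Delta_\ell$ from the unconstrained scalars, with an STE through the bit-width rounding; helper names are hypothetical, and `ste_round`/`dead_zone_quantize` are the sketches from Section 1:

```python
import torch

def layer_quantizer_params(w, theta_dz, theta_bit, b_min: int = 2, b_max: int = 8):
    """Map unconstrained scalars to dead-zone width d, bit-width b, and step Delta."""
    r = w.abs().max()                             # R_l: absmax weight scale
    d = 2 * r * (1 - torch.tanh(theta_dz.abs()))  # dead-zone width in [0, 2*R_l]
    b_soft = torch.tanh(theta_bit.abs()) * (b_max - b_min) + b_min
    b = ste_round(b_soft)                         # integer bit-width, STE gradient
    delta = (2 * r - d) / (2 ** (b - 1) * 2 - 1)  # step size above the dead zone
    return d, b, delta

# Applying the quantizer to one layer's weights:
# d, b, delta = layer_quantizer_params(w, theta_dz, theta_bit)
# w_hat = dead_zone_quantize(w, delta, d, b)
```

The $(b_{\min}, b_{\max})$ range of 2 to 8 bits here is an assumption for illustration, not a value taken from the paper.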

3. Rate-Distortion Optimization and Sparsity Control

CoDeQ enables explicit, tunable rate-distortion control by modulating the quantizer step size $Q$ and offset $\delta$ (for image compression) or the dead-zone widths $d_\ell$ (for model compression). In the classical image codec setting, rate $R = \mathbb{E}_x[-\log_2 P(q(x))]$ and distortion $D = \mathbb{E}_x[\mathrm{Dist}(x, \hat{x})]$ are jointly optimized with a Lagrange multiplier $\lambda$, tracing the rate-distortion curve via selection of operating points.

Under RaDOGAGA’s loss, the latent space becomes orthonormal and metric-isometric, enabling the dead-zone quantizer to function equivalently to a learned KLT/DCT transform followed by scalar quantization. Experimentally, sweeping $Q \in [0.5, 4.0]$ achieves bitrates from $0.1$ bpp to $1.0$ bpp, matching the RD performance of individually tuned models (Zhou et al., 2020).

For model compression, varying $d_\ell$ at runtime directly adjusts layer sparsity, decoupled from the quantization bit-width. Regularization via $\lambda_{\mathrm{dz}}$ controls global sparsity, and $\lambda_{\mathrm{bit}}$ encourages low precision. The optimization is differentiable and performed end-to-end; all pruning and quantization decisions are learned jointly (Wenshøj et al., 15 Dec 2025).
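Since every weight with $|w| \leq d_\ell/2$ maps to zero, layer sparsity can be read off directly from the dead-zone width. A minimal illustration (helper name hypothetical):

```python
import torch

def layer_sparsity(w: torch.Tensor, d: float) -> float:
    """Fraction of weights pruned by a dead zone of width d (|w| <= d/2 -> 0)."""
    return (w.abs() <= d / 2).float().mean().item()

# Widening the dead zone monotonically increases sparsity.
w = torch.randn(10_000)
for d in (0.5, 1.0, 2.0):
    print(f"d = {d}: sparsity = {layer_sparsity(w, d):.1%}")
```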

4. Algorithmic Pipeline and Inference Procedure

Image compression with CoDeQ is realized by the following inference steps (a code sketch follows the list):

  • Input: image $x$,
  • Analysis transform: $y = f_\theta(x)$,
  • Dead-zone quantization: $q = \mathrm{sign}(y) \cdot \lfloor |y|/Q + \delta \rfloor$,
  • Entropy coding: bits via arithmetic encoding with $P(q)$ from $\mathrm{CDF}_\psi$,
  • Dequantization: $\hat{y} = qQ$,
  • Synthesis transform: $\hat{x} = g_\phi(\hat{y})$.
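A minimal end-to-end sketch of this path, assuming trained `encoder`/`decoder` modules and an `arithmetic_encode` placeholder for the entropy coder (all names illustrative):

```python
import torch

def compress(x, encoder, decoder, arithmetic_encode, Q: float, delta: float = 0.45):
    """Sketch of the CoDeQ image-compression inference pipeline."""
    y = encoder(x)                                        # analysis transform f_theta
    q = torch.sign(y) * torch.floor(y.abs() / Q + delta)  # dead-zone quantization
    bitstream = arithmetic_encode(q)                      # entropy coding, P(q) from CDF_psi
    y_hat = q * Q                                         # dequantization
    x_hat = decoder(y_hat)                                # synthesis transform g_phi
    return bitstream, x_hat
```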

Model compression follows a fully differentiable QAT loop (sketched after the list):

  • Initialize weights and parameters,
  • For each layer $\ell$, compute $R_\ell$, $d_\ell$, $b_\ell$, $\Delta_\ell$,
  • Quantize/prune: $\hat{w}_\ell = \tilde{q}(w_\ell; \Delta_\ell, d_\ell)$,
  • Compute task loss and $\ell_2$ regularizers,
  • Backpropagate using STE for non-differentiable components,
  • Update parameters via gradient descent (Wenshøj et al., 15 Dec 2025).
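A compact sketch of one such training step, reusing the helpers above and substituting the quantized weights into the forward pass via `torch.func.functional_call`. For simplicity every parameter is quantized here, whereas in practice one would typically restrict this to weight tensors:

```python
import torch
from torch.func import functional_call

def codeq_train_step(model, batch, targets, thetas_dz, thetas_bit,
                     optimizer, lam_dz: float, lam_bit: float):
    """One CoDeQ QAT step, using layer_quantizer_params and dead_zone_quantize."""
    optimizer.zero_grad()
    quantized = {}
    for name, w in model.named_parameters():
        d, b, delta = layer_quantizer_params(w, thetas_dz[name], thetas_bit[name])
        quantized[name] = dead_zone_quantize(w, delta, d, b)
    # Forward pass with the quantized/pruned weights substituted in.
    logits = functional_call(model, quantized, (batch,))
    loss = torch.nn.functional.cross_entropy(logits, targets)
    loss = loss + lam_dz * sum(t.pow(2).sum() for t in thetas_dz.values())
    loss = loss + lam_bit * sum(t.pow(2).sum() for t in thetas_bit.values())
    loss.backward()  # STE carries gradients through the rounding steps
    optimizer.step()
    return loss.item()
```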

5. Empirical Results and Rate-Sparsity Benchmarks

On Kodak images, CoDeQ’s variable-rate method matches the rate-distortion envelope of independently trained Ballé et al. (2017) models for both PSNR and MS-SSIM criteria:

  • PSNR: CoDeQ deviates by $\pm 0.05$ dB from the multi-model baseline,
  • MS-SSIM: deviation within $\pm 0.01$ dB,
  • Best results with dead-zone offset $\delta = 0.45$ (Zhou et al., 2020).

Visual inspection indicates superior retention of fine textures at low bitrates compared to non-CoDeQ baselines, with minor edge artifacts attributed to scalar uniform quantization. At high bitrates, reconstructions are nearly identical.

For ImageNet/ResNet-18 model compression:

  • Full-precision baseline: $70.30\%$ Top-1, $100\%$ BOPs,
  • SQL (joint ADMM): $68.60\%$, $6.20\%$ BOPs,
  • QST-B: $69.90\%$, $5.00\%$ BOPs,
  • CoDeQ (fixed 4-bit + learned sparsity): $69.83 \pm 0.14\%$, $4.75 \pm 0.19\%$ BOPs,
  • CoDeQ (mixed precision + learned sparsity): $69.81 \pm 0.09\%$, $4.42 \pm 0.05\%$ BOPs.

Layer-wise analysis shows CoDeQ assigns higher bit-width/lower sparsity to first/last layers and lower bit-width/higher sparsity to middle layers, matching canonical heuristic allocations but discovered automatically in the end-to-end process (Wenshøj et al., 15 Dec 2025).

6. Discussion: Implementation, Limitations, and Extensions

CoDeQ offers several operational advantages:

  • Eliminates the need for auxiliary search phases or parameter selection procedures outside the training loop,
  • Pruning and quantization are unified, architecture-agnostic, and controlled by global regularization,
  • Fixed 4-bit variant matches mixed-precision performance, favoring hardware compatibility.

Limitations include the use of layer-wise granularity for quantization and pruning; finer granularity (channel-, group-, or block-wise) and structured sparsity remain open for future development. The current scale factors utilize absmax statistics; direct learning of scale could further decrease quantization error.

Potential extensions of the CoDeQ framework are:

  • Per-channel/group dead-zone control for sparsity granularity,
  • Joint scale learning with regularization,
  • Structured dead-zone quantization for hardware exploitation,
  • Integration with post-training coding (e.g., Huffman, distillation) for maximal compression.

CoDeQ bridges magnitude pruning and uniform quantization, leveraging dead-zone properties for efficient compression in both data and model domains, with strong empirical performance and minimal operational complexity (Zhou et al., 2020, Wenshøj et al., 15 Dec 2025).
