
Compression with Dead-Zone Quantizer (CoDeQ)

Updated 22 December 2025
  • Compression with Dead-zone Quantizer (CoDeQ) is a unified methodology that integrates dead-zone scalar quantization to perform both magnitude pruning and uniform quantization in a single, differentiable operator.
  • It enables explicit rate-distortion and sparsity control by tuning quantizer parameters such as step size, offset, and dead-zone width, thereby optimizing performance across image and model compression tasks.
  • Empirical results show that CoDeQ matches the rate-distortion performance of state-of-the-art codecs and achieves competitive neural network accuracies with reduced bit operations and enhanced sparsity.

Compression with Dead-zone Quantizer (CoDeQ) represents a unified methodology for signal and model compression that leverages the properties of dead-zone scalar quantization. Dead-zone quantizers introduce adjustable regions around zero where all values are mapped to zero, effecting both quantization and magnitude pruning in a single operator. This principle enables efficient image compression and supports highly sparse, low-precision neural networks in a fully differentiable, end-to-end optimization procedure. CoDeQ, as formalized in (Zhou et al., 2020) and further generalized for model compression in (Wenshøj et al., 15 Dec 2025), provides explicit control over rate-distortion tradeoffs and network sparsity by varying quantizer parameters, with provable near-optimal performance across a spectrum of applications.

1. Dead-Zone Quantizer: Formulation and Properties

The dead-zone quantizer is a variant of the uniform scalar quantizer that incorporates a widened zero bin. For a $b$-bit quantizer with step size $\Delta$, the classical form maps an input $x$ as

$$\bar{x} = \mathrm{clip}\bigl(\lfloor x/\Delta \rceil,\, -Q_b,\, Q_b\bigr), \qquad Q_b = 2^{b-1} - 1,$$

assigning values with $|x| \leq \Delta/2$ to zero, thereby pruning small coefficients.

CoDeQ generalizes this with an explicit dead-zone width $d \geq \Delta$:

$$\tilde{q}(x; \Delta, d) = \begin{cases} 0, & |x| \leq d/2, \\ \mathrm{sign}(x)\left(\dfrac{d}{2} + \Delta\, \mathrm{clip}\left(\left\lfloor \dfrac{|x| - d/2}{\Delta} \right\rceil, -Q_b, Q_b\right)\right), & |x| > d/2. \end{cases}$$

This scheme is equivalent to magnitude pruning with threshold $\tau = d/2$, combined with uniform quantization above the threshold. For deep learning compression, the mapping is made differentiable via straight-through estimators, enabling its use in joint pruning–quantization settings (Wenshøj et al., 15 Dec 2025).
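As an illustration, the following PyTorch sketch implements $\tilde{q}$ with a straight-through estimator applied at the rounding step; the function names are illustrative, and the exact STE placement in the paper may differ:

```python
import torch

def ste_round(x: torch.Tensor) -> torch.Tensor:
    # Straight-through estimator: round in the forward pass, identity gradient.
    return x + (torch.round(x) - x).detach()

def dead_zone_quantize(w, delta, d, b):
    """Generalized dead-zone quantizer q~(w; Delta, d).

    `delta`, `d`, and `b` may be Python scalars or scalar tensors. Values with
    |w| <= d/2 are pruned to zero; the remainder are uniformly quantized with
    step size `delta`, clipped to the b-bit level range.
    """
    qb = 2 ** (b - 1) - 1
    mask = (w.abs() > d / 2).to(w.dtype)  # 0 inside the dead zone, 1 outside
    levels = torch.clamp(ste_round((w.abs() - d / 2) / delta), -qb, qb)
    return torch.sign(w) * (d / 2 + delta * levels) * mask
```

With $d = \Delta$ this reduces, up to the reconstruction offset, to the classical dead-zone quantizer above.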

The dead-zone quantizer for image latent coefficients is similarly parameterized by step size $Q$ and offset $\delta$:

$$q = \mathrm{sign}(y)\left\lfloor \frac{|y|}{Q} + \delta \right\rfloor, \qquad \hat{y} = qQ,$$

where $\delta \in [0, 0.5]$ tunes the symmetry and width of the dead zone. In practice, $\delta = 0.45$ provides optimal rate-distortion performance for learned codecs (Zhou et al., 2020).
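For concreteness, a minimal sketch of this latent quantizer and its dequantization (variable names are illustrative):

```python
import torch

def latent_quantize(y: torch.Tensor, Q: float, delta: float = 0.45):
    """Dead-zone quantization of latents: q = sign(y) * floor(|y|/Q + delta)."""
    q = torch.sign(y) * torch.floor(y.abs() / Q + delta)
    y_hat = q * Q  # dequantized latent fed to the decoder
    return q, y_hat
```

Smaller $\delta$ widens the zero bin (more coefficients pruned to zero), while larger $Q$ coarsens the step size; both reduce rate at the cost of distortion.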

2. Network Architectures and Training Protocols

CoDeQ-based image compression methods employ autoencoder architectures in the tradition of Ballé et al. (2017). The encoder $f_\theta$ consists of three strided convolutions with GDN nonlinearities, mapping $256 \times 256 \times 3$ patches to $1 \times 1 \times 128$ latent representations. The decoder $g_\phi$ features three transposed convolutions with inverse GDN. Bottleneck dimensionality is fixed at 128 coefficients per patch; models are trained over 1 million steps with batch size 8 and learning rate $10^{-4}$ (Zhou et al., 2020).

Training is performed using the RaDOGAGA (Rate-Distortion Optimization Guided Autoencoder for Isometric Embedding in Euclidean Latent Space) framework. RaDOGAGA adds distortion-driven orthogonality regularization in latent space:

  • Primary distortion: $D_1 = D(x, \hat{x})$ (MSE or MS-SSIM),
  • Jacobian orthogonality: $D_2 = D(\hat{x}, \tilde{x})$ with $\tilde{x} = g_\phi(y + \epsilon)$ and $\epsilon \sim \mathrm{Uniform}(-\alpha/2, \alpha/2)$,
  • Log-rate penalty via a learned CDF prior $P_\psi(y) = \mathrm{CDF}_\psi(y + \alpha/2) - \mathrm{CDF}_\psi(y - \alpha/2)$.

The total loss is

$$L(\theta, \phi, \psi) = -\log_2 P_\psi(y) + \lambda_1 \log D_1 + \lambda_2 D_2,$$

with metric-specific ($\lambda_1$, $\lambda_2$, $\alpha$) settings for MSE and MS-SSIM. The induced latent space becomes isometric to the selected fidelity metric, ensuring rate-distortion optimality under subsequent quantization (Zhou et al., 2020).
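A schematic of this loss computation, assuming an `encoder`/`decoder` pair, a distortion callable `dist`, and a learned-prior CDF `cdf_psi` (all placeholders; the actual RaDOGAGA implementation may differ in detail):

```python
import torch

def radogaga_loss(x, encoder, decoder, cdf_psi, dist, lam1, lam2, alpha):
    """Sketch of the RaDOGAGA objective: rate + lam1 * log(D1) + lam2 * D2."""
    y = encoder(x)
    x_hat = decoder(y)
    # Perturbed reconstruction for the latent-orthogonality term.
    eps = (torch.rand_like(y) - 0.5) * alpha  # Uniform(-alpha/2, alpha/2)
    x_tilde = decoder(y + eps)

    d1 = dist(x, x_hat)        # primary distortion (e.g. MSE)
    d2 = dist(x_hat, x_tilde)  # Jacobian-orthogonality term
    # Discrete likelihood of each coefficient under the learned CDF prior.
    p_y = cdf_psi(y + alpha / 2) - cdf_psi(y - alpha / 2)
    rate = -torch.log2(p_y.clamp_min(1e-9)).sum()
    return rate + lam1 * torch.log(d1) + lam2 * d2
```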

In the context of model compression, the dead-zone widths $d_\ell$ and optional bit-widths $b_\ell$ are parameterized and learned end-to-end via unconstrained scalars $\theta_{\mathrm{dz},\ell}$ and $\theta_{\mathrm{bit},\ell}$, using the formulas

$$d_\ell(\theta_{\mathrm{dz},\ell}) = 2 R_\ell \left[1 - \tanh|\theta_{\mathrm{dz},\ell}|\right], \qquad R_\ell = \max_i |w_{\ell,i}|,$$

$$b_\ell(\theta_{\mathrm{bit},\ell}) = \left\lfloor \tanh|\theta_{\mathrm{bit},\ell}|\,(b_{\max} - b_{\min}) + b_{\min} \right\rceil,$$

and quantization step

$$\Delta_\ell = \frac{2 R_\ell - d_\ell}{2^{b_\ell - 1} \cdot 2 - 1}.$$

The optimization objective is

$$\mathcal{L}_{\mathrm{CoDeQ}} = \mathcal{L}_{\mathrm{task}}\bigl(\{\tilde{q}(w_\ell; \Delta_\ell, d_\ell)\}_\ell\bigr) + \lambda_{\mathrm{dz}} \sum_\ell \|\theta_{\mathrm{dz},\ell}\|_2^2 + \lambda_{\mathrm{bit}} \sum_\ell \|\theta_{\mathrm{bit},\ell}\|_2^2,$$

with $\mathcal{L}_{\mathrm{task}}$ typically the cross-entropy over quantized weights (Wenshøj et al., 15 Dec 2025).
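The parameter mapping is straightforward to express in code. Below is a hedged sketch deriving $d_\ell$, $b_\ell$, and $\Delta_\ell$ from the unconstrained scalars, with an STE through the bit-width rounding; helper names are hypothetical, and `ste_round`/`dead_zone_quantize` are the sketches from Section 1:

```python
import torch

def layer_quantizer_params(w, theta_dz, theta_bit, b_min: int = 2, b_max: int = 8):
    """Map unconstrained scalars to dead-zone width d, bit-width b, and step Delta."""
    r = w.abs().max()                             # R_l: absmax weight scale
    d = 2 * r * (1 - torch.tanh(theta_dz.abs()))  # dead-zone width in [0, 2*R_l]
    b_soft = torch.tanh(theta_bit.abs()) * (b_max - b_min) + b_min
    b = ste_round(b_soft)                         # integer bit-width, STE gradient
    delta = (2 * r - d) / (2 ** (b - 1) * 2 - 1)  # step size above the dead zone
    return d, b, delta

# Applying the quantizer to one layer's weights:
# d, b, delta = layer_quantizer_params(w, theta_dz, theta_bit)
# w_hat = dead_zone_quantize(w, delta, d, b)
```

The $(b_{\min}, b_{\max})$ range of 2 to 8 bits here is an assumption for illustration, not a value taken from the paper.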

3. Rate-Distortion Optimization and Sparsity Control

CoDeQ enables explicit, tunable rate-distortion control by modulating the quantizer step size $Q$ and offset $\delta$ (for image compression) or the dead-zone widths $d_\ell$ (for model compression). In the classical image codec setting, rate $R = \mathbb{E}_x[-\log_2 P(q(x))]$ and distortion $D = \mathbb{E}_x[\mathrm{Dist}(x, \hat{x})]$ are jointly optimized with a Lagrange multiplier $\lambda$, tracing the rate-distortion curve via selection of operating points.

Under RaDOGAGA’s loss, the latent space becomes orthonormal and metric-isometric, enabling the dead-zone quantizer to function equivalently to a learned KLT/DCT transform followed by scalar quantization. Experimentally, sweeping $Q \in [0.5, 4.0]$ achieves bitrates from $0.1$ bpp to $1.0$ bpp, matching the RD performance of individually tuned models (Zhou et al., 2020).

For model compression, varying $d_\ell$ at runtime directly adjusts layer sparsity, decoupled from the quantization bit-width. Regularization via $\lambda_{\mathrm{dz}}$ controls global sparsity, and $\lambda_{\mathrm{bit}}$ encourages low precision. The optimization is differentiable and performed end-to-end; all pruning and quantization decisions are learned jointly (Wenshøj et al., 15 Dec 2025).
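Since every weight with $|w| \leq d_\ell/2$ maps to zero, layer sparsity can be read off directly from the dead-zone width. A minimal illustration (helper name hypothetical):

```python
import torch

def layer_sparsity(w: torch.Tensor, d: float) -> float:
    """Fraction of weights pruned by a dead zone of width d (|w| <= d/2 -> 0)."""
    return (w.abs() <= d / 2).float().mean().item()

# Widening the dead zone monotonically increases sparsity.
w = torch.randn(10_000)
for d in (0.5, 1.0, 2.0):
    print(f"d = {d}: sparsity = {layer_sparsity(w, d):.1%}")
```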

4. Algorithmic Pipeline and Inference Procedure

Image compression with CoDeQ is realized by the following inference steps (a code sketch follows the list):

  • Input: image $x$,
  • Analysis transform: $y = f_\theta(x)$,
  • Dead-zone quantization: $q = \mathrm{sign}(y) \cdot \lfloor |y|/Q + \delta \rfloor$,
  • Entropy coding: bits via arithmetic encoding with $P(q)$ from $\mathrm{CDF}_\psi$,
  • Dequantization: $\hat{y} = qQ$,
  • Synthesis transform: $\hat{x} = g_\phi(\hat{y})$.
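A minimal end-to-end sketch of this path, assuming trained `encoder`/`decoder` modules and an `arithmetic_encode` placeholder for the entropy coder (all names illustrative):

```python
import torch

def compress(x, encoder, decoder, arithmetic_encode, Q: float, delta: float = 0.45):
    """Sketch of the CoDeQ image-compression inference pipeline."""
    y = encoder(x)                                        # analysis transform f_theta
    q = torch.sign(y) * torch.floor(y.abs() / Q + delta)  # dead-zone quantization
    bitstream = arithmetic_encode(q)                      # entropy coding, P(q) from CDF_psi
    y_hat = q * Q                                         # dequantization
    x_hat = decoder(y_hat)                                # synthesis transform g_phi
    return bitstream, x_hat
```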

Model compression follows a fully differentiable QAT loop (sketched after the list):

  • Initialize weights and parameters,
  • For each layer $\ell$, compute $R_\ell$, $d_\ell$, $b_\ell$, $\Delta_\ell$,
  • Quantize/prune: $\hat{w}_\ell = \tilde{q}(w_\ell; \Delta_\ell, d_\ell)$,
  • Compute task loss and $\ell_2$ regularizers,
  • Backpropagate using STE for non-differentiable components,
  • Update parameters via gradient descent (Wenshøj et al., 15 Dec 2025).
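A compact sketch of one such training step, reusing the helpers above and substituting the quantized weights into the forward pass via `torch.func.functional_call`. For simplicity every parameter is quantized here, whereas in practice one would typically restrict this to weight tensors:

```python
import torch
from torch.func import functional_call

def codeq_train_step(model, batch, targets, thetas_dz, thetas_bit,
                     optimizer, lam_dz: float, lam_bit: float):
    """One CoDeQ QAT step, using layer_quantizer_params and dead_zone_quantize."""
    optimizer.zero_grad()
    quantized = {}
    for name, w in model.named_parameters():
        d, b, delta = layer_quantizer_params(w, thetas_dz[name], thetas_bit[name])
        quantized[name] = dead_zone_quantize(w, delta, d, b)
    # Forward pass with the quantized/pruned weights substituted in.
    logits = functional_call(model, quantized, (batch,))
    loss = torch.nn.functional.cross_entropy(logits, targets)
    loss = loss + lam_dz * sum(t.pow(2).sum() for t in thetas_dz.values())
    loss = loss + lam_bit * sum(t.pow(2).sum() for t in thetas_bit.values())
    loss.backward()  # STE carries gradients through the rounding steps
    optimizer.step()
    return loss.item()
```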

5. Empirical Results and Rate-Sparsity Benchmarks

On Kodak images, CoDeQ’s variable-rate method matches the rate-distortion envelope of independently trained Ballé et al. (2017) models for both PSNR and MS-SSIM criteria:

  • PSNR: CoDeQ deviates by $\pm 0.05$ dB from the multi-model baseline,
  • MS-SSIM: deviation within $\pm 0.01$ dB,
  • Best results with dead-zone offset $\delta = 0.45$ (Zhou et al., 2020).

Visual inspection indicates superior retention of fine textures at low bitrates compared to non-CoDeQ baselines, with minor edge artifacts attributed to scalar uniform quantization. At high bitrates, reconstructions are nearly identical.

For ImageNet/ResNet-18 model compression:

  • Full-precision baseline: $70.30\%$ Top-1, $100\%$ BOPs,
  • SQL (joint ADMM): $68.60\%$, $6.20\%$ BOPs,
  • QST-B: $69.90\%$, $5.00\%$ BOPs,
  • CoDeQ (fixed 4-bit + learned sparsity): $69.83 \pm 0.14\%$, $4.75 \pm 0.19\%$ BOPs,
  • CoDeQ (mixed precision + learned sparsity): $69.81 \pm 0.09\%$, $4.42 \pm 0.05\%$ BOPs.

Layer-wise analysis shows CoDeQ assigns higher bit-width/lower sparsity to first/last layers and lower bit-width/higher sparsity to middle layers, matching canonical heuristic allocations but discovered automatically in the end-to-end process (Wenshøj et al., 15 Dec 2025).

6. Discussion: Implementation, Limitations, and Extensions

CoDeQ offers several operational advantages:

  • Eliminates the need for auxiliary search phases or parameter selection procedures outside the training loop,
  • Pruning and quantization are unified, architecture-agnostic, and controlled by global regularization,
  • Fixed 4-bit variant matches mixed-precision performance, favoring hardware compatibility.

Limitations include the use of layer-wise granularity for quantization and pruning; finer granularity (channel-, group-, or block-wise) and structured sparsity remain open for future development. The current scale factors utilize absmax statistics; direct learning of scale could further decrease quantization error.

Potential extensions of the CoDeQ framework are:

  • Per-channel/group dead-zone control for sparsity granularity,
  • Joint scale learning with regularization,
  • Structured dead-zone quantization for hardware exploitation,
  • Integration with post-training coding (e.g., Huffman, distillation) for maximal compression.

CoDeQ bridges magnitude pruning and uniform quantization, leveraging dead-zone properties for efficient compression in both data and model domains, with strong empirical performance and minimal operational complexity (Zhou et al., 2020, Wenshøj et al., 15 Dec 2025).
