Turbo-DDCM: Efficient Diffusion Image Compression

Updated 16 November 2025
  • Turbo-DDCM is an efficient zero-shot diffusion-based image compression method that employs closed‐form multi-atom codebook selection to reduce computational overhead.
  • It substantially cuts reverse diffusion steps by up to 95%, while offering priority-aware and distortion-controlled variants for enhanced region-of-interest fidelity and targeted PSNR control.
  • Experimental results show Turbo-DDCM achieves competitive PSNR, LPIPS, and FID metrics at dramatically lower runtimes compared to traditional diffusion codecs.

Turbo-DDCM is an efficient and flexible zero-shot diffusion-based image compression methodology that advances prior Denoising Diffusion Codebook Models (DDCMs) by introducing a closed-form, multi-atom codebook selection and improved bitstream protocols. Turbo-DDCM substantially reduces the number of reverse diffusion steps required for image reconstruction, thereby enabling orders-of-magnitude speedup over existing zero-shot diffusion codecs, while retaining competitive perceptual and distortion metrics against state-of-the-art methods. The design offers two notable variants—priority-aware and distortion-controlled compression—that allow explicit user control over region-of-interest and distortion targets within the zero-shot paradigm (Vaisman et al., 9 Nov 2025).

1. Theoretical Foundations and Connection to DDCMs

Turbo-DDCM builds upon the paradigm of denoising diffusion probabilistic models (DDPMs), leveraging their iterative noising and denoising stochastic processes. The forward process applies a sequence of Gaussian noise injections:

$$x_t = \sqrt{\bar\alpha_t}\,x_0 + \sqrt{1 - \bar\alpha_t}\,\epsilon, \quad \epsilon \sim \mathcal{N}(0, I),$$

with $\bar\alpha_t = \prod_{s=1}^{t} \alpha_s$ encoding the cumulative product of the noise-variance schedule.
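A minimal NumPy sketch of this forward sampling step (the function name `forward_noise` and the array-based schedule handling are illustrative assumptions, not the paper's implementation):

```python
import numpy as np

def forward_noise(x0, t, alphas):
    """Sample x_t ~ q(x_t | x_0) from the DDPM forward process.

    x0     : clean image as a NumPy array
    t      : 1-based timestep index
    alphas : per-step schedule values alpha_1, ..., alpha_T
    """
    alpha_bar_t = np.prod(alphas[:t])                         # cumulative product up to step t
    eps = np.random.randn(*x0.shape)                          # epsilon ~ N(0, I)
    return np.sqrt(alpha_bar_t) * x0 + np.sqrt(1.0 - alpha_bar_t) * eps
```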

The DDCM framework replaces the random Gaussian noise in the reverse generative step,

$$x_{t-1} = \mu_\theta(x_t, t) + \sigma_t\,z, \quad z \sim \mathcal{N}(0, I),$$

with disambiguating noise vectors $z_t^{(k)}$ from a reproducible codebook of $K$ atoms, enabling discrete, bit-indexed control over the latent diffusion trajectory. Standard DDCM with $M = 1$ selects at each step the codebook atom maximizing alignment with the stepwise residual $r_t = x_0 - \hat x_{0|t}$, storing $\lceil \log_2 K \rceil$ bits per denoising step. Multi-atom DDCM generalizes this via matching pursuit (MP), selecting $M > 1$ atoms per step in a sequential manner, but at significant computational cost.

Turbo-DDCM advances this by performing simultaneous closed-form selection of $M$ atoms, optimizing

$$z_t^* = \arg\min_{c \in \mathbb{R}^K} \|Z_t\,c - r_t\|_2^2 \quad \text{subject to} \quad \|c\|_0 = M,\; c_i \in \mathcal{V} \cup \{0\},$$

where $Z_t$ collects the $K$ codebook atoms and $\mathcal{V}$ is a quantized coefficient set. Under near-orthogonality (typical of i.i.d. Gaussian codebooks), this reduces to top-$M$ selection by the absolute inner product with the residual, assigning $\operatorname{sign}(\alpha_i)$ as the coefficient for each selected atom (where $\alpha_i = \langle z_t^{(i)}, r_t \rangle$), followed by normalization. This eliminates the need for iterative search and drastically reduces the required number of denoiser calls.
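A minimal NumPy sketch of this closed-form selection rule, assuming a sign-only coefficient set $\mathcal{V} = \{-1, +1\}$ (the function name `turbo_select` and the array layout are illustrative):

```python
import numpy as np

def turbo_select(Z, r, M):
    """Simultaneous top-M codebook selection (illustrative sketch).

    Z : (d, K) array whose columns are the K codebook atoms z_t^(i)
    r : (d,) residual r_t = x_0 - x_hat_{0|t}
    M : number of atoms combined per step
    """
    alpha = Z.T @ r                           # inner products alpha_i = <z_t^(i), r_t>
    idx = np.argsort(-np.abs(alpha))[:M]      # top-M atoms by |alpha_i| (near-orthogonal codebook)
    coeff = np.sign(alpha[idx])               # sign coefficients for the selected atoms
    combo = Z[:, idx] @ coeff                 # linear combination of the chosen atoms
    z_star = combo / combo.std()              # normalize before injecting into the reverse step
    return idx, coeff, z_star
```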

2. Algorithmic Workflow and Bitstream Protocol

Turbo-DDCM’s encoding and decoding process consists of:

  • Encoding steps ($t = T, \dots, N+1$):
  1. Compute the residual $r_t = x_0 - \hat x_{0|t}$.
  2. Calculate the inner products $\alpha_i = \langle z_t^{(i)}, r_t \rangle$ for all codebook atoms.
  3. Select the top-$M$ indices by $|\alpha_i|$.
  4. Set coefficients $c_i = \operatorname{sign}(\alpha_i)$ for the selected $i$, else $0$.
  5. Serialize the selected subset as its lexicographic rank ($\lceil \log_2 \binom{K}{M} \rceil$ bits) and the $M$ coefficient values ($MC$ bits).
  6. Update $x_{t-1}$ by injecting the normalized linear combination of atoms $z_t^* = Z_t c^* / \mathrm{std}(Z_t c^*)$.
  • Bit Protocol:
    • Transmit the lexicographic rank of the subset $S$ (of size $M$ out of $K$), not ordered indices, eliminating permutation redundancy (a rank-computation sketch follows this list).
    • For the $T - N - 1$ coded steps, the total bits per pixel are

    $$\mathrm{BPP}_{\rm Turbo} = \frac{(T-N-1)\left(\lceil \log_2 \binom{K}{M} \rceil + MC\right)}{\#\,\text{pixels}}.$$

    No bits are transmitted for the $N$ final deterministic DDIM steps, which conclude the image restoration.

  • Computational Complexity:

    • Each encoding/decoding step costs $\Theta(Kd + K \log M)$, enabling practical settings ($K \sim 256$, $M \sim 6$) with a total of $T \sim 20$–$30$ steps, i.e., a 95% reduction in denoiser calls relative to DDCM ($T \sim 500$–$1000$).
  • Decoding: Follows the identical reverse process, reconstructing $x_0$ from the codebook indices and coefficients provided by the bitstream.
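The lexicographic subset rank used in the bit protocol can be computed with the standard combinatorial number system. The sketch below is illustrative (the function name `subset_rank` is not from the paper), and the example bit budget assumes one sign bit per coefficient ($C = 1$):

```python
from math import comb

def subset_rank(indices, K):
    """Lexicographic rank of a sorted M-subset of {0, ..., K-1}.

    The rank identifies the subset uniquely among all C(K, M) choices,
    so it fits in ceil(log2(C(K, M))) bits, as in the BPP formula above.
    """
    indices = sorted(indices)
    M = len(indices)
    rank, prev = 0, -1
    for pos, idx in enumerate(indices):
        # count subsets whose element at this position is smaller than idx
        for skipped in range(prev + 1, idx):
            rank += comb(K - skipped - 1, M - pos - 1)
        prev = idx
    return rank

# Illustrative bit budget for one coded step with K = 256, M = 6, C = 1 sign bit:
#   ceil(log2(C(256, 6))) = 39 index bits + M*C = 6 coefficient bits -> 45 bits per step.
print(subset_rank([3, 17, 42, 99, 100, 200], K=256))
```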

3. Flexible Compression Variants

Turbo-DDCM supports two algorithmic extensions:

Priority-Aware Turbo-DDCM:

Allows explicit focus on user-specified image regions of interest (ROI) by replacing the residual $r_t$ in the atom-selection step with a masked version $p \odot (x_0 - \hat x_{0|t})$, $p \in \mathbb{R}_+^d$. This increases atom allocation where fidelity is prioritized, without altering bitrate or runtime. Empirical results indicate substantial ROI fidelity gains at fixed BPP.
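As a sketch under the same assumptions as the selection example in Section 1, the variant reduces to weighting the residual before atom selection (array names here are illustrative):

```python
import numpy as np

def priority_residual(x0, x0_hat_t, p):
    """Masked residual for the priority-aware variant (sketch).

    p : nonnegative per-pixel priority map (e.g. > 1 inside the ROI, 1 elsewhere).
    The top-M atom selection is then run on this weighted residual instead of r_t,
    leaving bitrate and runtime unchanged.
    """
    return p * (x0 - x0_hat_t)      # elementwise p ⊙ (x_0 - x_hat_{0|t})

# Example: emphasize a central 100x100 region of a 256x256 image by a factor of 4.
p = np.ones((256, 256))
p[78:178, 78:178] = 4.0
```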

Distortion-Controlled Turbo-DDCM:

Addresses the inherent PSNR variability at fixed bitrates in zero-shot compression by exploiting a strong empirical linear correlation between Turbo-DDCM's PSNR and the file size of the image compressed with JPEG at quality $q = 100$:

$$\mathrm{PSNR}_{\rm Turbo}(b) \approx a_b - b_b \cdot \mathrm{size}_{\rm JPEG}(q=100).$$

Linear predictors trained across BPPs guide the per-image selection of the minimal bitrate $b$ that achieves a target PSNR. On test images, this reduces the PSNR root-mean-square error by over 40% compared to naïve BPP selection.
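A minimal sketch of this calibration-and-selection procedure, with an ordinary least-squares fit standing in for the paper's linear predictors (function names and the data layout are illustrative assumptions, not the paper's code):

```python
import numpy as np

def fit_psnr_predictors(jpeg_sizes, psnrs_per_bpp):
    """Fit PSNR_Turbo(b) ≈ a_b - b_b * size_JPEG(q=100) for each candidate bitrate b.

    jpeg_sizes    : (N,) JPEG (quality 100) file sizes of N calibration images
    psnrs_per_bpp : dict mapping candidate BPP b -> (N,) measured Turbo-DDCM PSNRs
    """
    predictors = {}
    for b, psnrs in psnrs_per_bpp.items():
        slope, intercept = np.polyfit(jpeg_sizes, psnrs, deg=1)  # PSNR = slope*size + intercept
        predictors[b] = (intercept, -slope)                      # store as (a_b, b_b)
    return predictors

def pick_bpp(jpeg_size, target_psnr, predictors):
    """Smallest candidate BPP whose predicted PSNR reaches the target."""
    for b in sorted(predictors):
        a_b, b_b = predictors[b]
        if a_b - b_b * jpeg_size >= target_psnr:
            return b
    return max(predictors)   # fall back to the largest available bitrate
```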

4. Experimental Results and Benchmarks

Turbo-DDCM was evaluated on the Kodak24 ($512^2$) and DIV2K ($768^2$) datasets, in comparison with other zero-shot and trained methods. Table 1, at BPP $\approx 0.05$, reports:

Method        PSNR (dB)   LPIPS   FID   Time (s/img)
BPG           24.1        0.25    120   0.1
PerCo (SD)    25.6        0.15    22    1.0
DiffC         25.2        0.18    30    10
DDCM          24.8        0.20    45    65
Turbo-DDCM    25.3        0.17    20    1.5

Turbo-DDCM matches or surpasses all zero-shot methods in PSNR, LPIPS, and FID, while running 3×–40× faster. Compared to trained models, only PerCo (SD) slightly outperforms it in PSNR, but at lower perceptual quality (higher FID).

Ablation studies demonstrate:

  • Turbo-DDCM’s top-$M$ thresholded atom combinations achieve equal or better angular alignment with the residual than DDCM's matching pursuit, with performance continuing to improve as $M$ increases (unlike MP, which plateaus).
  • Empirical runtimes are $10^2$–$10^4\times$ faster than DDCM + MP across practical $(K, M, C)$ settings.

5. Limitations and Prospects

Turbo-DDCM introduces significant practical benefits, but several limitations remain:

  • At high bitrates, the codec’s underlying latent-diffusion backbone yields diminishing returns in PSNR, suggesting encoder/decoder distortion floors. End-to-end image-space training could potentially ameliorate this ceiling.
  • The current reverse process requires roughly 20–30 diffusion steps; further efficiency gains may be possible by learning direct mappings from noisy latents to $x_0$ ("one-step zero-shot solvers").
  • A formal information-theoretic analysis of the limits and optimality of codebook-based zero-shot encoding remains open.

This suggests future research directions toward tighter latent/image coupling, adaptive codebook designs, and theoretical characterizations of zero-shot diffusion compression.

6. Context and Significance

Turbo-DDCM enables zero-shot diffusion-based compression to become a practical tool for both research and applied imaging, offering sub-2 s per-image decode times and fine-grained control of bitrate, ROI, and distortion without dataset-specific training. The method’s closed-form multi-atom codebook selection and bit-efficient indexing protocol lead to an approximately 95% reduction in denoiser calls and a roughly 40% BPP reduction over naïve bit-packing approaches, shifting the zero-shot diffusion codec paradigm from an academic concept toward routine deployment in bandwidth-sensitive imaging scenarios.
