Turbo-DDCM: Efficient Diffusion Image Compression

Updated 16 November 2025
  • Turbo-DDCM is an efficient zero-shot diffusion-based image compression method that employs closed‐form multi-atom codebook selection to reduce computational overhead.
  • It substantially cuts reverse diffusion steps by up to 95%, while offering priority-aware and distortion-controlled variants for enhanced region-of-interest fidelity and targeted PSNR control.
  • Experimental results show Turbo-DDCM achieves competitive PSNR, LPIPS, and FID metrics at dramatically lower runtimes compared to traditional diffusion codecs.

Turbo-DDCM is an efficient and flexible zero-shot diffusion-based image compression methodology that advances prior Denoising Diffusion Codebook Models (DDCMs) by introducing a closed-form, multi-atom codebook selection and improved bitstream protocols. Turbo-DDCM substantially reduces the number of reverse diffusion steps required for image reconstruction, thereby enabling orders-of-magnitude speedup over existing zero-shot diffusion codecs, while retaining competitive perceptual and distortion metrics against state-of-the-art methods. The design offers two notable variants—priority-aware and distortion-controlled compression—that allow explicit user control over region-of-interest and distortion targets within the zero-shot paradigm (Vaisman et al., 9 Nov 2025).

1. Theoretical Foundations and Connection to DDCMs

Turbo-DDCM builds upon the paradigm of denoising diffusion probabilistic models (DDPMs), leveraging their iterative noising and denoising stochastic processes. The forward process applies a sequence of Gaussian noise injections:

$$x_t = \sqrt{\bar\alpha_t}\,x_0 + \sqrt{1 - \bar\alpha_t}\,\epsilon, \quad \epsilon \sim \mathcal{N}(0, I),$$

with $\bar\alpha_t = \prod_{s=1}^{t} \alpha_s$ encoding the cumulative product of the noise-variance schedule.
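A minimal NumPy sketch of this forward sampling step (the function name `forward_noise` and the array-based schedule handling are illustrative assumptions, not the paper's implementation):

```python
import numpy as np

def forward_noise(x0, t, alphas):
    """Sample x_t ~ q(x_t | x_0) from the DDPM forward process.

    x0     : clean image as a NumPy array
    t      : 1-based timestep index
    alphas : per-step schedule values alpha_1, ..., alpha_T
    """
    alpha_bar_t = np.prod(alphas[:t])                         # cumulative product up to step t
    eps = np.random.randn(*x0.shape)                          # epsilon ~ N(0, I)
    return np.sqrt(alpha_bar_t) * x0 + np.sqrt(1.0 - alpha_bar_t) * eps
```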

The DDCM framework replaces the random Gaussian noise in the reverse generative step,

$$x_{t-1} = \mu_\theta(x_t, t) + \sigma_t\,z, \quad z \sim \mathcal{N}(0, I),$$

with disambiguating noise vectors $z_t^{(k)}$ from a reproducible codebook of $K$ atoms, enabling discrete, bit-indexed control over the latent diffusion trajectory. Standard DDCM with $M = 1$ selects at each step the codebook atom maximizing alignment with the stepwise residual $r_t = x_0 - \hat x_{0|t}$, storing $\lceil \log_2 K \rceil$ bits per denoising step. Multi-atom DDCM generalizes this via matching pursuit (MP), selecting $M > 1$ atoms per step in a sequential manner, but at significant computational cost.

Turbo-DDCM advances this by performing simultaneous closed-form selection of $M$ atoms, optimizing

$$z_t^* = \arg\min_{c \in \mathbb{R}^K} \|Z_t\,c - r_t\|_2^2 \quad \text{subject to} \quad \|c\|_0 = M,\; c_i \in \mathcal{V} \cup \{0\},$$

where $Z_t$ collects the $K$ codebook atoms and $\mathcal{V}$ is a quantized coefficient set. Under near-orthogonality (typical of i.i.d. Gaussian codebooks), this reduces to top-$M$ selection by the absolute inner product with the residual, assigning $\operatorname{sign}(\alpha_i)$ as the coefficient for each selected atom (where $\alpha_i = \langle z_t^{(i)}, r_t \rangle$), followed by normalization. This eliminates the need for iterative search and drastically reduces the required number of denoiser calls.
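A minimal NumPy sketch of this closed-form selection rule, assuming a sign-only coefficient set $\mathcal{V} = \{-1, +1\}$ (the function name `turbo_select` and the array layout are illustrative):

```python
import numpy as np

def turbo_select(Z, r, M):
    """Simultaneous top-M codebook selection (illustrative sketch).

    Z : (d, K) array whose columns are the K codebook atoms z_t^(i)
    r : (d,) residual r_t = x_0 - x_hat_{0|t}
    M : number of atoms combined per step
    """
    alpha = Z.T @ r                           # inner products alpha_i = <z_t^(i), r_t>
    idx = np.argsort(-np.abs(alpha))[:M]      # top-M atoms by |alpha_i| (near-orthogonal codebook)
    coeff = np.sign(alpha[idx])               # sign coefficients for the selected atoms
    combo = Z[:, idx] @ coeff                 # linear combination of the chosen atoms
    z_star = combo / combo.std()              # normalize before injecting into the reverse step
    return idx, coeff, z_star
```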

2. Algorithmic Workflow and Bitstream Protocol

Turbo-DDCM’s encoding and decoding process consists of:

  • Encoding steps ($t = T, \dots, N+1$):
  1. Compute the residual $r_t = x_0 - \hat x_{0|t}$.
  2. Calculate the inner products $\alpha_i = \langle z_t^{(i)}, r_t \rangle$ for all codebook atoms.
  3. Select the top-$M$ indices by $|\alpha_i|$.
  4. Set coefficients $c_i = \operatorname{sign}(\alpha_i)$ for the selected $i$, else $0$.
  5. Serialize the selected subset as its lexicographic rank ($\lceil \log_2 \binom{K}{M} \rceil$ bits) and the $M$ coefficient values ($MC$ bits).
  6. Update $x_{t-1}$ by injecting the normalized linear combination of atoms $z_t^* = Z_t c^* / \mathrm{std}(Z_t c^*)$.
  • Bit Protocol:
    • Transmit the lexicographic rank of the subset $S$ (of size $M$ out of $K$), not ordered indices, eliminating permutation redundancy (a rank-computation sketch follows this list).
    • For the $T - N - 1$ coded steps, the total bits per pixel are

    $$\mathrm{BPP}_{\rm Turbo} = \frac{(T-N-1)\left(\lceil \log_2 \binom{K}{M} \rceil + MC\right)}{\#\,\text{pixels}}.$$

    No bits are transmitted for the $N$ final deterministic DDIM steps, which conclude the image restoration.

  • Computational Complexity:

    • Each encoding/decoding step costs $\Theta(Kd + K \log M)$, enabling practical settings ($K \sim 256$, $M \sim 6$) with a total of $T \sim 20$–$30$ steps, i.e., a 95% reduction in denoiser calls relative to DDCM ($T \sim 500$–$1000$).
  • Decoding: Follows the identical reverse process, reconstructing $x_0$ from the codebook indices and coefficients provided by the bitstream.
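The lexicographic subset rank used in the bit protocol can be computed with the standard combinatorial number system. The sketch below is illustrative (the function name `subset_rank` is not from the paper), and the example bit budget assumes one sign bit per coefficient ($C = 1$):

```python
from math import comb

def subset_rank(indices, K):
    """Lexicographic rank of a sorted M-subset of {0, ..., K-1}.

    The rank identifies the subset uniquely among all C(K, M) choices,
    so it fits in ceil(log2(C(K, M))) bits, as in the BPP formula above.
    """
    indices = sorted(indices)
    M = len(indices)
    rank, prev = 0, -1
    for pos, idx in enumerate(indices):
        # count subsets whose element at this position is smaller than idx
        for skipped in range(prev + 1, idx):
            rank += comb(K - skipped - 1, M - pos - 1)
        prev = idx
    return rank

# Illustrative bit budget for one coded step with K = 256, M = 6, C = 1 sign bit:
#   ceil(log2(C(256, 6))) = 39 index bits + M*C = 6 coefficient bits -> 45 bits per step.
print(subset_rank([3, 17, 42, 99, 100, 200], K=256))
```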

3. Flexible Compression Variants

Turbo-DDCM supports two algorithmic extensions:

Priority-Aware Turbo-DDCM:

Allows explicit focus on user-specified image regions of interest (ROI) by replacing the residual $r_t$ in the atom-selection step with a masked version $p \odot (x_0 - \hat x_{0|t})$, $p \in \mathbb{R}_+^d$. This increases atom allocation where fidelity is prioritized, without altering bitrate or runtime. Empirical results indicate substantial ROI fidelity gains at fixed BPP.
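As a sketch under the same assumptions as the selection example in Section 1, the variant reduces to weighting the residual before atom selection (array names here are illustrative):

```python
import numpy as np

def priority_residual(x0, x0_hat_t, p):
    """Masked residual for the priority-aware variant (sketch).

    p : nonnegative per-pixel priority map (e.g. > 1 inside the ROI, 1 elsewhere).
    The top-M atom selection is then run on this weighted residual instead of r_t,
    leaving bitrate and runtime unchanged.
    """
    return p * (x0 - x0_hat_t)      # elementwise p ⊙ (x_0 - x_hat_{0|t})

# Example: emphasize a central 100x100 region of a 256x256 image by a factor of 4.
p = np.ones((256, 256))
p[78:178, 78:178] = 4.0
```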

Distortion-Controlled Turbo-DDCM:

Addresses the inherent PSNR variability at fixed bitrates in zero-shot compression by exploiting a strong empirical linear correlation between Turbo-DDCM's PSNR and the file size of the image compressed with JPEG at quality $q = 100$:

$$\mathrm{PSNR}_{\rm Turbo}(b) \approx a_b - b_b \cdot \mathrm{size}_{\rm JPEG}(q=100).$$

Linear predictors trained across BPPs guide the per-image selection of the minimal bitrate $b$ that achieves a target PSNR. On test images, this reduces the PSNR root-mean-square error by over 40% compared to naïve BPP selection.
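A minimal sketch of this calibration-and-selection procedure, with an ordinary least-squares fit standing in for the paper's linear predictors (function names and the data layout are illustrative assumptions, not the paper's code):

```python
import numpy as np

def fit_psnr_predictors(jpeg_sizes, psnrs_per_bpp):
    """Fit PSNR_Turbo(b) ≈ a_b - b_b * size_JPEG(q=100) for each candidate bitrate b.

    jpeg_sizes    : (N,) JPEG (quality 100) file sizes of N calibration images
    psnrs_per_bpp : dict mapping candidate BPP b -> (N,) measured Turbo-DDCM PSNRs
    """
    predictors = {}
    for b, psnrs in psnrs_per_bpp.items():
        slope, intercept = np.polyfit(jpeg_sizes, psnrs, deg=1)  # PSNR = slope*size + intercept
        predictors[b] = (intercept, -slope)                      # store as (a_b, b_b)
    return predictors

def pick_bpp(jpeg_size, target_psnr, predictors):
    """Smallest candidate BPP whose predicted PSNR reaches the target."""
    for b in sorted(predictors):
        a_b, b_b = predictors[b]
        if a_b - b_b * jpeg_size >= target_psnr:
            return b
    return max(predictors)   # fall back to the largest available bitrate
```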

4. Experimental Results and Benchmarks

Turbo-DDCM was evaluated on the Kodak24 ($512^2$) and DIV2K ($768^2$) datasets, in comparison with other zero-shot and trained methods. Table 1, at BPP $\approx 0.05$, reports:

Method        PSNR (dB)   LPIPS   FID   Time (s/img)
BPG           24.1        0.25    120   0.1
PerCo (SD)    25.6        0.15    22    1.0
DiffC         25.2        0.18    30    10
DDCM          24.8        0.20    45    65
Turbo-DDCM    25.3        0.17    20    1.5

Turbo-DDCM matches or surpasses all zero-shot methods in PSNR, LPIPS, and FID, while running 3×–40× faster. Compared to trained models, only PerCo (SD) slightly outperforms it in PSNR, but at lower perceptual quality (higher FID).

Ablation studies demonstrate:

  • Turbo-DDCM’s top-$M$ thresholded atom combinations achieve equal or better angular alignment with the residual than DDCM's matching pursuit, with performance continuing to improve as $M$ increases (unlike MP, which plateaus).
  • Empirical runtimes are $10^2$–$10^4\times$ faster than DDCM + MP across practical $(K, M, C)$ settings.

5. Limitations and Prospects

Turbo-DDCM introduces significant practical benefits, but several limitations remain:

  • At high bitrates, the codec’s underlying latent-diffusion backbone yields diminishing returns in PSNR, suggesting encoder/decoder distortion floors. End-to-end image-space training could potentially ameliorate this ceiling.
  • The current reverse process requires roughly 20–30 diffusion steps; further efficiency gains may be possible by learning direct mappings from noisy latents to $x_0$ ("one-step zero-shot solvers").
  • A formal information-theoretic analysis of the limits and optimality of codebook-based zero-shot encoding remains open.

This suggests future research directions toward tighter latent/image coupling, adaptive codebook designs, and theoretical characterizations of zero-shot diffusion compression.

6. Context and Significance

Turbo-DDCM enables zero-shot diffusion-based compression to become a practical tool for both research and applied imaging, offering sub-2 s per-image decode times and fine-grained control of bitrate, ROI, and distortion without dataset-specific training. The method’s closed-form multi-atom codebook selection and bit-efficient indexing protocol lead to an approximately 95% reduction in denoiser calls and a roughly 40% BPP reduction over naïve bit-packing approaches, shifting the zero-shot diffusion codec paradigm from an academic concept toward routine deployment in bandwidth-sensitive imaging scenarios.
