Turbo-DDCM: Efficient Diffusion Image Compression
- Turbo-DDCM is an efficient zero-shot diffusion-based image compression method that employs closed-form multi-atom codebook selection to reduce computational overhead.
- It cuts the number of reverse diffusion steps by up to 95%, while offering priority-aware and distortion-controlled variants for region-of-interest fidelity and targeted PSNR control.
- Experimental results show Turbo-DDCM achieves competitive PSNR, LPIPS, and FID metrics at dramatically lower runtimes compared to traditional diffusion codecs.
Turbo-DDCM is an efficient and flexible zero-shot diffusion-based image compression methodology that advances prior Denoising Diffusion Codebook Models (DDCMs) by introducing a closed-form, multi-atom codebook selection and improved bitstream protocols. Turbo-DDCM substantially reduces the number of reverse diffusion steps required for image reconstruction, thereby enabling orders-of-magnitude speedup over existing zero-shot diffusion codecs, while retaining competitive perceptual and distortion metrics against state-of-the-art methods. The design offers two notable variants—priority-aware and distortion-controlled compression—that allow explicit user control over region-of-interest and distortion targets within the zero-shot paradigm (Vaisman et al., 9 Nov 2025).
1. Theoretical Foundations and Connection to DDCMs
Turbo-DDCM builds upon the paradigm of denoising diffusion probabilistic models (DDPMs), leveraging their iterative noising and denoising stochastic processes. The forward process applies a sequence of Gaussian noise injections, so that

$$x_t = \sqrt{\bar\alpha_t}\, x_0 + \sqrt{1 - \bar\alpha_t}\, \epsilon, \qquad \epsilon \sim \mathcal{N}(0, I),$$

with $\bar\alpha_t = \prod_{s=1}^{t} \alpha_s$ encoding the cumulative product of the noise variance schedule.
The DDCM framework replaces the random Gaussian noise $z_t \sim \mathcal{N}(0, I)$ in the reverse generative step,

$$x_{t-1} = \mu_\theta(x_t, t) + \sigma_t z_t,$$

with noise vectors drawn from a reproducible codebook of $K$ atoms, enabling discrete bit-indexed control over the latent diffusion trajectory. Standard DDCM ($M = 1$) selects at each step the codebook atom maximizing alignment with the stepwise residual $r_t$ (the direction from the current estimate toward the source image), storing $\log_2 K$ bits per denoising step. Multi-atom DDCM generalizes this via matching pursuit (MP), selecting $M$ atoms per step in a sequential manner, but at significant computational cost.
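For concreteness, here is a minimal PyTorch sketch of a single $M = 1$ DDCM reverse step; the helper `denoiser_mean` and all names are illustrative assumptions, not the authors' API.

```python
import torch

def ddcm_reverse_step(x_t, t, r_t, codebook, sigma_t, denoiser_mean):
    """One M=1 DDCM reverse step (illustrative sketch): pick the codebook
    atom best aligned with the residual r_t and inject it in place of
    random Gaussian noise.

    codebook: (K, d) tensor of i.i.d. Gaussian atoms, reproducible from a
    shared seed so encoder and decoder agree on it.
    """
    scores = codebook @ r_t.flatten()   # inner products <d_i, r_t>, shape (K,)
    idx = int(torch.argmax(scores))     # best-aligned atom -> log2(K) bits
    z = codebook[idx].view_as(x_t)
    x_prev = denoiser_mean(x_t, t) + sigma_t * z
    return x_prev, idx
```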
Turbo-DDCM advances this by performing simultaneous closed-form selection of $M$ atoms, optimizing

$$\min_{\mathbf{c}} \left\| r_t - \mathbf{D}\mathbf{c} \right\|_2^2 \quad \text{s.t.} \quad \|\mathbf{c}\|_0 \le M,\; c_i \in \mathcal{Q},$$

where $\mathbf{D} \in \mathbb{R}^{d \times K}$ collects the codebook atoms and $\mathcal{Q}$ is a quantized coefficient set. Under near-orthogonality (typical of i.i.d. Gaussian codebooks), this reduces to top-$M$ selection by the absolute inner product with the residual, assigning $c_i = \operatorname{sign}(\langle \mathbf{d}_i, r_t \rangle)$ as the coefficient for each selected atom, followed by normalization of the resulting combination. This eliminates the need for iterative search and drastically reduces the required number of denoiser calls.
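A minimal PyTorch sketch of this closed-form selection, under the sign-coefficient and unit-norm assumptions above (function and variable names are illustrative, not the authors' reference implementation):

```python
import torch

def turbo_select(residual, codebook, M):
    """Closed-form multi-atom selection: top-M atoms by |<d_i, r>|,
    sign coefficients, normalized combination (a sketch, not the
    paper's exact code).

    residual: (d,) flattened residual r_t
    codebook: (K, d) i.i.d. Gaussian atoms shared via a common seed
    """
    scores = codebook @ residual                  # (K,) inner products
    top = torch.topk(scores.abs(), M).indices     # top-M by |<d_i, r>|
    coeffs = torch.sign(scores[top])              # +/-1 coefficients
    combo = (coeffs[:, None] * codebook[top]).sum(dim=0)
    combo = combo / combo.norm()                  # unit-norm injected "noise"
    return top, coeffs, combo
```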
2. Algorithmic Workflow and Bitstream Protocol
Turbo-DDCM’s encoding and decoding process consists of:
- Encoding steps (one per stochastic reverse step, $T'$ in total; see the loop sketch after this list):
- Compute the residual $r_t$ between the source image and the current estimate.
- Calculate the inner products $\langle \mathbf{d}_i, r_t \rangle$ for all $K$ codebook atoms.
- Select the top-$M$ indices by absolute inner product $|\langle \mathbf{d}_i, r_t \rangle|$.
- Set coefficients $c_i = \operatorname{sign}(\langle \mathbf{d}_i, r_t \rangle)$ for the selected atoms, else $0$.
- Serialize the selected subset as its lexicographic rank ($\lceil \log_2 \binom{K}{M} \rceil$ bits) and the coefficient values ($M$ sign bits).
- Update $x_{t-1} = \mu_\theta(x_t, t) + \sigma_t \hat{z}_t$, injecting the normalized linear combination of atoms $\hat{z}_t = \sum_i c_i \mathbf{d}_i \,/\, \|\sum_i c_i \mathbf{d}_i\|$.
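Putting these steps together, a hypothetical encoding loop (building on `turbo_select` above; `predict_x0`, `denoiser_mean`, and the schedule format are assumptions) might look like:

```python
def turbo_encode(x0, x_T, codebook, M, schedule, denoiser_mean, predict_x0):
    """Illustrative end-to-end encoding loop; names/signatures are assumed.
    Each iteration emits one (subset, signs) record for the bitstream."""
    x_t = x_T
    stream = []
    for t, sigma_t in schedule:                    # the T' stochastic steps
        r_t = (x0 - predict_x0(x_t, t)).flatten()  # stepwise residual
        top, coeffs, combo = turbo_select(r_t, codebook, M)
        stream.append((top.tolist(), coeffs.tolist()))
        x_t = denoiser_mean(x_t, t) + sigma_t * combo.view_as(x_t)
    return stream, x_t   # a few deterministic DDIM steps then conclude
```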
- Bit Protocol:
- Transmit the lexicographic rank of the selected subset (an $M$-element subset of the $K$ atoms) rather than $M$ ordered indices, eliminating permutation redundancy; a rank-computation sketch follows this list.
- For $T'$ steps on an $H \times W$ image, the total bits per pixel are
$$\mathrm{BPP} = \frac{T' \left( \lceil \log_2 \binom{K}{M} \rceil + M \right)}{H W}.$$
- No bits are transmitted for the final deterministic DDIM steps, which conclude image restoration.
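One standard way to realize this subset coding is the combinatorial number system; the sketch below is a generic implementation of that idea, not necessarily the paper's exact scheme.

```python
from math import comb

def subset_rank(indices, K):
    """Lexicographic rank of a sorted M-subset of {0..K-1} via the
    combinatorial number system; fits in ceil(log2(C(K, M))) bits."""
    indices = sorted(indices)
    M = len(indices)
    rank, prev = 0, -1
    for j, idx in enumerate(indices):
        # count subsets that place a smaller element at position j
        for smaller in range(prev + 1, idx):
            rank += comb(K - smaller - 1, M - j - 1)
        prev = idx
    return rank
```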
Computational Complexity:
- Each encoding/decoding step costs $O(Kd)$ for the codebook inner products plus one denoiser call, enabling feasible settings (large $K$, moderate $M$) with a total of $T' \approx 20$–$30$ steps, i.e., roughly a 95% denoiser-call reduction relative to DDCM ($T$ on the order of 500–1000).
- Decoding: Follows the identical reverse process, reconstructing the image from the codebook indices and coefficients provided by the bitstream.
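As a purely illustrative bit-budget calculation with the BPP formula above (the values of $K$, $M$, and $T'$ here are assumptions for arithmetic only, not the paper's operating point):

```latex
% Hypothetical setting: K = 4096 atoms, M = 16 atoms per step, T' = 25 steps,
% on a 768 x 512 image (393,216 pixels).
\[
\left\lceil \log_2 \tbinom{4096}{16} \right\rceil = 148 \ \text{bits (subset rank)},
\qquad 148 + 16 = 164 \ \text{bits per step},
\]
\[
\mathrm{BPP} = \frac{25 \times 164}{393{,}216} \approx 0.0104 .
\]
```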
3. Flexible Compression Variants
Turbo-DDCM supports two algorithmic extensions:
Priority-Aware Turbo-DDCM:
Allows explicit focus on user-specified image regions of interest (ROI) by replacing the residual in the encoding step with a masked version $\tilde{r}_t = m \odot r_t$, where $m$ is a user-supplied spatial priority mask. This increases atom allocation where fidelity is prioritized, without altering bitrate or runtime. Empirical results indicate substantial ROI fidelity gains at fixed BPP.
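A minimal sketch of the masked-residual substitution (mask semantics and names are assumptions):

```python
def priority_residual(residual, mask):
    """Reweight the residual before atom selection so top-M picks favor
    the ROI. `mask` might be 1.0 inside the ROI and a smaller weight
    outside; it is broadcast to the residual's shape."""
    return residual * mask   # elementwise (Hadamard) product
```

Since only the selection target changes, the bitstream format, bitrate, and step count are unaffected.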
Distortion-Controlled Turbo-DDCM:
Addresses the inherent PSNR variability at fixed bitrates in zero-shot compression by exploiting a strong empirical linear correlation between Turbo-DDCM's PSNR at a given BPP and the image's lossless-JPEG file size $s$ (a proxy for image complexity), $\mathrm{PSNR} \approx a \cdot s + b$. Linear predictors trained across BPPs guide per-image selection of the minimal BPP achieving a target PSNR. On test images, this reduces PSNR root-mean-square error by over 40% compared to naïve fixed-BPP selection.
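An illustrative version of such a predictor, assuming one linear fit per candidate BPP (the exact regression setup and names are assumptions):

```python
import numpy as np

def fit_psnr_predictors(sizes, psnrs_by_bpp):
    """Fit one linear PSNR predictor per candidate BPP (illustrative).

    sizes: (N,) lossless file sizes of N training images
    psnrs_by_bpp: dict {bpp: (N,) measured Turbo-DDCM PSNRs}
    Returns {bpp: (slope, intercept)}.
    """
    return {bpp: tuple(np.polyfit(sizes, psnrs, deg=1))
            for bpp, psnrs in psnrs_by_bpp.items()}

def pick_min_bpp(size, target_psnr, predictors):
    """Smallest BPP whose predicted PSNR meets the target."""
    for bpp in sorted(predictors):
        a, b = predictors[bpp]
        if a * size + b >= target_psnr:
            return bpp
    return max(predictors)   # fall back to the highest available rate
```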
4. Experimental Results and Benchmarks
Turbo-DDCM was evaluated on the Kodak dataset (24 images at 768×512) and DIV2K (2K-resolution images), in comparison with other zero-shot and trained methods. Representative results at a fixed BPP:
| Method | PSNR (dB) | LPIPS | FID | Time (s/img) |
|---|---|---|---|---|
| BPG | 24.1 | 0.25 | 120 | 0.1 |
| PerCo (SD) | 25.6 | 0.15 | 22 | 1.0 |
| DiffC | 25.2 | 0.18 | 30 | 10 |
| DDCM | 24.8 | 0.20 | 45 | 65 |
| Turbo-DDCM | 25.3 | 0.17 | 20 | 1.5 |
Turbo-DDCM matches or surpasses all zero-shot methods in PSNR, LPIPS, and FID, while running 3×–40× faster. Compared to trained models, only PerCo (SD) slightly outperforms in PSNR, but with lower perceptual quality (higher FID).
Ablation studies demonstrate:
- Turbo-DDCM’s top-$M$ thresholded atom combinations achieve equal or better angular alignment with the residual than DDCM’s matching pursuit, with performance continuing to improve as $M$ increases (unlike MP, which plateaus).
- Empirical runtime is more than an order of magnitude faster than DDCM with MP across practical values of $M$.
5. Limitations and Prospects
Turbo-DDCM introduces significant practical benefits, but several limitations remain:
- At high bitrates, the codec’s underlying latent-diffusion backbone yields diminishing returns in PSNR, suggesting encoder/decoder distortion floors. End-to-end image-space training could potentially ameliorate this ceiling.
- Current reverse processes require 20–30 diffusion steps; further efficiency gains may be possible by learning direct mappings from noisy latents to the clean reconstruction ("one-step zero-shot solvers").
- A formal information-theoretic analysis of the limits and optimality of codebook-based zero-shot encoding remains open.
This suggests future research directions toward tighter latent/image coupling, adaptive codebook designs, and theoretical characterization of zero-shot diffusion compression.
6. Context and Significance
Turbo-DDCM enables zero-shot diffusion-based compression to become a practical tool for both research and applied imaging, offering sub-2s per-image decode times and fine-grained control of bitrate, ROI, and distortion without dataset-specific training. The method’s closed-form multi-atom codebook selection and bit-efficient indexing protocol lead to an approximately 95% reduction in denoiser calls and a 40% BPP reduction over naïve bit-packing approaches, shifting the zero-shot diffusion codec paradigm from an academic concept toward routine deployment in bandwidth-sensitive imaging scenarios.