Semantic Ensemble Loss and Latent Refinement for High-Fidelity Neural Image Compression (2401.14007v2)

Published 25 Jan 2024 in eess.IV and cs.CV

Abstract: Recent advances in neural compression have surpassed traditional codecs in PSNR and MS-SSIM. At low bit-rates, however, these methods can introduce visually displeasing artifacts, such as blurring, color shifting, and texture loss, compromising the perceptual quality of images. To address these issues, this study presents an enhanced neural compression method designed for optimal visual fidelity. We train our model with a semantic ensemble loss, integrating Charbonnier loss, perceptual loss, style loss, and a non-binary adversarial loss, to enhance the perceptual quality of image reconstructions. Additionally, we implement a latent refinement process to generate content-aware latent codes. These codes adhere to bit-rate constraints, balance the trade-off between distortion and fidelity, and prioritize bit allocation to regions of greater importance. Our empirical findings demonstrate that this approach significantly improves the statistical fidelity of neural image compression. On the CLIC2024 validation set, our approach achieves a 62% bit-rate saving over MS-ILLM under the FID metric.
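The semantic ensemble loss described in the abstract combines four terms into one training objective. As a minimal sketch in plain Python (the weights and the `eps` value are illustrative assumptions, not taken from the paper, and the perceptual, style, and adversarial terms are assumed to be precomputed scalars), the Charbonnier term and the weighted combination might look like:

```python
import math


def charbonnier(x, y, eps=1e-3):
    """Charbonnier loss: a smooth, differentiable L1 variant,
    sqrt((x - y)^2 + eps^2), averaged over elements."""
    return sum(math.sqrt((a - b) ** 2 + eps ** 2)
               for a, b in zip(x, y)) / len(x)


def ensemble_loss(pixel, perceptual, style, adversarial,
                  w_pix=1.0, w_per=1.0, w_sty=1.0, w_adv=0.1):
    """Weighted sum of the four loss terms; the weights here are
    placeholders for illustration, not the paper's settings."""
    return (w_pix * pixel + w_per * perceptual
            + w_sty * style + w_adv * adversarial)
```

In practice each term would be computed on tensors (e.g. perceptual loss from deep-feature distances, style loss from feature statistics), but the combination step reduces to this weighted sum.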

Authors (5)
  1. Daxin Li
  2. Yuanchao Bai
  3. Kai Wang
  4. Junjun Jiang
  5. Xianming Liu