
Exploring Compressed Image Representation as a Perceptual Proxy: A Study (2401.07200v1)

Published 14 Jan 2024 in cs.CV, cs.LG, and eess.IV

Abstract: We propose an end-to-end learned image compression codec wherein the analysis transform is jointly trained with an object classification task. This study affirms that the compressed latent representation can predict human perceptual distance judgments with an accuracy comparable to a custom-tailored DNN-based quality metric. We further investigate various neural encoders and demonstrate the effectiveness of employing the analysis transform as a perceptual loss network for image tasks beyond quality judgments. Our experiments show that an off-the-shelf neural encoder is proficient at perceptual modeling without needing an additional VGG network. We expect this research to serve as a valuable reference for developing a semantic-aware and coding-efficient neural encoder.
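
The idea of measuring perceptual distance in a codec's latent space can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: `toy_analysis_transform` is a hypothetical stand-in that uses a random, untrained patch projection instead of a learned analysis transform, and `latent_distance` borrows the channel-normalized squared-difference form familiar from deep-feature metrics rather than the authors' exact formulation.

```python
import numpy as np

def toy_analysis_transform(image, weights, patch=4):
    """Hypothetical stand-in for a learned analysis transform.

    Splits an (H, W) grayscale image into non-overlapping patch x patch
    blocks, projects each block with a fixed weight matrix, and applies a
    ReLU. A real codec would use trained convolutional layers instead.
    """
    h, w = image.shape
    h, w = h - h % patch, w - w % patch           # crop to a multiple of patch
    blocks = (image[:h, :w]
              .reshape(h // patch, patch, w // patch, patch)
              .transpose(0, 2, 1, 3)
              .reshape(-1, patch * patch))         # (num_blocks, patch*patch)
    return np.maximum(blocks @ weights, 0.0)       # (num_blocks, channels)

def latent_distance(img_a, img_b, weights, eps=1e-8):
    """Mean squared difference between channel-normalized latents."""
    za = toy_analysis_transform(img_a, weights)
    zb = toy_analysis_transform(img_b, weights)
    za = za / (np.linalg.norm(za, axis=1, keepdims=True) + eps)
    zb = zb / (np.linalg.norm(zb, axis=1, keepdims=True) + eps)
    return float(np.mean(np.sum((za - zb) ** 2, axis=1)))

rng = np.random.default_rng(0)
weights = rng.standard_normal((16, 8))             # random, untrained projection
reference = rng.random((32, 32))
slightly_noisy = np.clip(reference + 0.01 * rng.standard_normal((32, 32)), 0, 1)
very_noisy = np.clip(reference + 0.5 * rng.standard_normal((32, 32)), 0, 1)

d_small = latent_distance(reference, slightly_noisy, weights)
d_large = latent_distance(reference, very_noisy, weights)
print(f"light distortion: {d_small:.6f}, heavy distortion: {d_large:.6f}")
```

In this toy setup the heavier distortion yields the larger latent distance. Whether such distances track human judgments depends on how the encoder was trained, which is the question the abstract addresses.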
