
The Unreasonable Effectiveness of Linear Prediction as a Perceptual Metric (2310.05986v1)

Published 6 Oct 2023 in cs.CV

Abstract: We show how perceptual embeddings of the visual system can be constructed at inference-time with no training data or deep neural network features. Our perceptual embeddings are solutions to a weighted least squares (WLS) problem, defined at the pixel-level, and solved at inference-time, that can capture global and local image characteristics. The distance in embedding space is used to define a perceptual similarity metric which we call LASI: Linear Autoregressive Similarity Index. Experiments on full-reference image quality assessment datasets show LASI performs competitively with learned deep feature based methods like LPIPS (Zhang et al., 2018) and PIM (Bhardwaj et al., 2020), at a similar computational cost to hand-crafted methods such as MS-SSIM (Wang et al., 2003). We found that increasing the dimensionality of the embedding space consistently reduces the WLS loss while increasing performance on perceptual tasks, at the cost of increasing the computational complexity. LASI is fully differentiable, scales cubically with the number of embedding dimensions, and can be parallelized at the pixel-level. A Maximum Differentiation (MAD) competition (Wang & Simoncelli, 2008) between LASI and LPIPS shows that both methods are capable of finding failure points for the other, suggesting these metrics can be combined.
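The abstract's core idea can be made concrete with a minimal NumPy sketch: each pixel's embedding is the solution of a small weighted least-squares linear-prediction problem over a causal context, and the distance between embeddings of two images serves as the similarity score. The causal neighborhood size, window length, and exponential weighting below are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def lasi_embedding(img, k=6, window=12, decay=4.0):
    """Per-pixel WLS linear-prediction embedding (illustrative sketch).

    For each pixel i (raster order), fit weights w that predict nearby
    past pixels from their own k causal predecessors, with the fitting
    weight decaying for pixels farther from i. The fitted w is pixel i's
    embedding; no training data or network features are involved.
    """
    x = img.ravel().astype(float)
    n = x.size
    emb = np.zeros((n, k))
    for i in range(k + 1, n):
        lo = max(k, i - window)
        # Design matrix: each row holds the k pixels preceding position j.
        A = np.stack([x[j - k:j][::-1] for j in range(lo, i)])
        y = x[lo:i]
        w = np.exp(-(i - np.arange(lo, i)) / decay)  # distance-based weights
        sw = np.sqrt(w)
        # Weighted least squares, solved via lstsq on the reweighted system.
        emb[i], *_ = np.linalg.lstsq(sw[:, None] * A, sw * y, rcond=None)
    return emb

def lasi_distance(img_a, img_b, **kw):
    """RMS distance between per-pixel embeddings, used as the metric."""
    ea, eb = lasi_embedding(img_a, **kw), lasi_embedding(img_b, **kw)
    return float(np.sqrt(np.mean(np.sum((ea - eb) ** 2, axis=1))))
```

Because each pixel's WLS problem is independent, the loop parallelizes trivially, and since the solve is a differentiable function of the pixels, the whole metric admits gradients — consistent with the abstract's claims about pixel-level parallelism and full differentiability.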

References (27)
  1. End-to-end optimized image compression. arXiv preprint arXiv:1611.01704, 2016.
  2. Variational image compression with a scale hyperprior. arXiv preprint arXiv:1802.01436, 2018.
  3. Optimal attentional allocation in the presence of capacity constraints in visual search. 2020a.
  4. Efficient data compression in perception and perceptual memory. Psychological Review, 127(5):891, 2020b.
  5. Representation and computation in working memory. 2022.
  6. An unsupervised information-theoretic perceptual quality metric. Advances in Neural Information Processing Systems, 33:13–24, 2020.
  7. Pattern recognition and machine learning, volume 4. Springer, 2006.
  8. JAX: composable transformations of Python+NumPy programs, 2018. URL http://github.com/google/jax.
  9. Compression in visual working memory: using statistical regularities to form more efficient memory representations. Journal of Experimental Psychology: General, 138(4):487, 2009.
  10. Thomas M Cover. Elements of information theory. John Wiley & Sons, 1999.
  11. Comparison of full-reference image quality models for optimization of image processing systems. International Journal of Computer Vision, 129:1258–1281, 2021.
  12. Quantifying visual image quality: A Bayesian view. Annual Review of Vision Science, 7:437–464, 2021.
  13. Bernd Girod. What’s wrong with mean squared error?, pp. 207–220. MIT Press, 1993. ISBN 0-262-23171-9.
  14. The singular value decomposition: Its computation and some applications. IEEE Transactions on Automatic Control, 25(2):164–176, 1980.
  15. Do better ImageNet classifiers assess perceptual similarity better? Transactions on Machine Learning Research, 2022.
  16. CONVIQT: Contrastive video quality estimator. arXiv preprint arXiv:2206.14713, 2022a.
  17. Image quality assessment using contrastive learning. IEEE Transactions on Image Processing, 31:4149–4161, 2022b.
  18. GLICBAWLS: Grey level image compression by adaptive weighted least squares. In Data Compression Conference, volume 503, 2001.
  19. Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556, 2014.
  20. Chris R Sims. Rate–distortion theory and human perception. Cognition, 152:181–198, 2016.
  21. Chris R Sims. Efficient coding explains the universal law of generalization in human perception. Science, 360(6389):652–656, 2018.
  22. Maximum differentiation (MAD) competition: A methodology for comparing computational models of perceptual quantities. Journal of Vision, 8(12):8–8, 2008.
  23. Multiscale structural similarity for image quality assessment. In The Thirty-Seventh Asilomar Conference on Signals, Systems & Computers, 2003, volume 2, pp. 1398–1402. IEEE, 2003.
  24. Image quality assessment: from error visibility to structural similarity. IEEE Transactions on Image Processing, 13(4):600–612, 2004.
  25. Contrastive distortion-level learning-based no-reference image-quality assessment. International Journal of Intelligent Systems, 37(11):8730–8746, 2022.
  26. The unreasonable effectiveness of deep features as a perceptual metric. In Proceedings of the IEEE conference on computer vision and pattern recognition, pp.  586–595, 2018.
  27. From distance to dependency: A paradigm shift of full-reference image quality assessment. arXiv preprint arXiv:2211.04927, 2022.