The Unreasonable Effectiveness of Linear Prediction as a Perceptual Metric (2310.05986v1)
Abstract: We show how perceptual embeddings of the visual system can be constructed at inference time with no training data or deep neural network features. Our perceptual embeddings are solutions to a weighted least squares (WLS) problem, defined at the pixel level and solved at inference time, that can capture global and local image characteristics. The distance in embedding space is used to define a perceptual similarity metric which we call LASI: Linear Autoregressive Similarity Index. Experiments on full-reference image quality assessment datasets show that LASI performs competitively with learned deep-feature-based methods like LPIPS (Zhang et al., 2018) and PIM (Bhardwaj et al., 2020), at a computational cost similar to that of hand-crafted methods such as MS-SSIM (Wang et al., 2003). We found that increasing the dimensionality of the embedding space consistently reduces the WLS loss while increasing performance on perceptual tasks, at the cost of increased computational complexity. LASI is fully differentiable, scales cubically with the number of embedding dimensions, and can be parallelized at the pixel level. A Maximum Differentiation (MAD) competition (Wang & Simoncelli, 2008) between LASI and LPIPS shows that each method is capable of finding failure points for the other, suggesting that these metrics can be combined.
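To make the idea of a per-pixel WLS embedding concrete, here is a minimal NumPy sketch. It is not the authors' implementation: the 4-pixel causal neighbourhood (`OFFSETS`), the geometric distance weighting (`decay`), the window size (`radius`), the ridge term, and the way embedding differences are averaged into a single score are all assumptions chosen for illustration, and the names `pixel_embedding` and `lasi_like_distance` are hypothetical. For each pixel, the sketch fits linear prediction weights on nearby, previously seen pixels and uses those weights as the embedding.

```python
import numpy as np

# Hypothetical causal neighbourhood: left, upper-left, up, upper-right.
OFFSETS = [(0, -1), (-1, -1), (-1, 0), (-1, 1)]


def causal_neighbors(img, y, x):
    """Values of the causal neighbours of (y, x); out-of-bounds reads as 0."""
    h, w = img.shape
    vals = []
    for dy, dx in OFFSETS:
        yy, xx = y + dy, x + dx
        vals.append(img[yy, xx] if 0 <= yy < h and 0 <= xx < w else 0.0)
    return np.array(vals)


def pixel_embedding(img, y, x, radius=3, decay=0.8, ridge=1e-3):
    """Solve a small WLS problem at pixel (y, x); the weights are the embedding."""
    h, w = img.shape
    rows, targets, weights = [], [], []
    for ty in range(max(0, y - radius), y + 1):
        for tx in range(max(0, x - radius), min(w, x + radius + 1)):
            if ty == y and tx >= x:
                break  # keep only pixels that precede (y, x) in raster order
            rows.append(causal_neighbors(img, ty, tx))
            targets.append(img[ty, tx])
            # Assumed weighting: closer training pixels count more.
            weights.append(decay ** (abs(ty - y) + abs(tx - x)))
    d = len(OFFSETS)
    if not rows:
        return np.zeros(d)  # no causal context (top-left pixel)
    A, b, wts = np.array(rows), np.array(targets), np.array(weights)
    AtW = A.T * wts
    # Normal equations of the WLS problem; the d x d solve is cubic in d.
    return np.linalg.solve(AtW @ A + ridge * np.eye(d), AtW @ b)


def lasi_like_distance(img_a, img_b, **kw):
    """Root-mean-square distance between per-pixel embeddings of two images."""
    h, w = img_a.shape
    total = 0.0
    for y in range(h):
        for x in range(w):
            diff = pixel_embedding(img_a, y, x, **kw) - pixel_embedding(img_b, y, x, **kw)
            total += float(diff @ diff)
    return float(np.sqrt(total / (h * w)))


# Usage: compare a small random image with a noisy copy of itself.
rng = np.random.default_rng(0)
reference = rng.random((8, 8))
distorted = reference + 0.1 * rng.standard_normal((8, 8))
print(lasi_like_distance(reference, distorted))
```

Each pixel's problem is independent of the others, so the two nested loops in `lasi_like_distance` could be vectorized or run in parallel, and the `d x d` linear solve is where the cubic dependence on the embedding dimension comes from.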
References
- End-to-end optimized image compression. arXiv preprint arXiv:1611.01704, 2016.
- Variational image compression with a scale hyperprior. arXiv preprint arXiv:1802.01436, 2018.
- Optimal attentional allocation in the presence of capacity constraints in visual search. 2020a.
- Efficient data compression in perception and perceptual memory. Psychological review, 127(5):891, 2020b.
- Representation and computation in working memory. 2022.
- An unsupervised information-theoretic perceptual quality metric. Advances in Neural Information Processing Systems, 33:13–24, 2020.
- Pattern recognition and machine learning, volume 4. Springer, 2006.
- JAX: composable transformations of Python+NumPy programs, 2018. URL http://github.com/google/jax.
- Compression in visual working memory: using statistical regularities to form more efficient memory representations. Journal of Experimental Psychology: General, 138(4):487, 2009.
- Thomas M Cover. Elements of information theory. John Wiley & Sons, 1999.
- Comparison of full-reference image quality models for optimization of image processing systems. International Journal of Computer Vision, 129:1258–1281, 2021.
- Quantifying visual image quality: A Bayesian view. Annual Review of Vision Science, 7:437–464, 2021.
- Bernd Girod. What’s wrong with mean squared error?, pp. 207–220. MIT Press, 1993. ISBN 0-262-23171-9.
- The singular value decomposition: Its computation and some applications. IEEE Transactions on automatic control, 25(2):164–176, 1980.
- Do better ImageNet classifiers assess perceptual similarity better? Transactions of Machine Learning Research, 2022.
- Conviqt: Contrastive video quality estimator. arXiv preprint arXiv:2206.14713, 2022a.
- Image quality assessment using contrastive learning. IEEE Transactions on Image Processing, 31:4149–4161, 2022b.
- Glicbawls-grey level image compression by adaptive weighted least squares. In Data Compression Conference, volume 503, 2001.
- Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556, 2014.
- Chris R Sims. Rate–distortion theory and human perception. Cognition, 152:181–198, 2016.
- Chris R Sims. Efficient coding explains the universal law of generalization in human perception. Science, 360(6389):652–656, 2018.
- Maximum differentiation (MAD) competition: A methodology for comparing computational models of perceptual quantities. Journal of Vision, 8(12):8–8, 2008.
- Multiscale structural similarity for image quality assessment. In The Thirty-Seventh Asilomar Conference on Signals, Systems & Computers, 2003, volume 2, pp. 1398–1402. IEEE, 2003.
- Image quality assessment: from error visibility to structural similarity. IEEE transactions on image processing, 13(4):600–612, 2004.
- Contrastive distortion-level learning-based no-reference image-quality assessment. International Journal of Intelligent Systems, 37(11):8730–8746, 2022.
- The unreasonable effectiveness of deep features as a perceptual metric. In Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 586–595, 2018.
- From distance to dependency: A paradigm shift of full-reference image quality assessment. arXiv preprint arXiv:2211.04927, 2022.