Wasserstein Distortion: Unifying Fidelity and Realism (2310.03629v2)
Abstract: We introduce a distortion measure for images, Wasserstein distortion, that simultaneously generalizes pixel-level fidelity on the one hand and realism or perceptual quality on the other. We show how Wasserstein distortion reduces to a pure fidelity constraint or a pure realism constraint under different parameter choices and discuss its metric properties. Pairs of images that are close under Wasserstein distortion illustrate its utility. In particular, we generate random textures that have high fidelity to a reference texture in one location of the image and smoothly transition to an independent realization of the texture as one moves away from this point. Wasserstein distortion attempts to generalize and unify prior work on texture generation, image realism and distortion, and models of the early human visual system, in the form of an optimizable metric in the mathematical sense.
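The abstract's central claim, that a single pooled-statistics distortion interpolates between pixel-level fidelity and realism, can be illustrated with a toy computation. Below is a minimal 1-D sketch, assuming scalar features, Gaussian pooling, and the 1-D Wasserstein-1 distance; the paper's actual construction (its feature maps, pooling kernels, and Wasserstein order) may differ, and the function names here are hypothetical.

```python
# Toy sketch of the interpolation idea (not the authors' exact construction):
# compare *pooled* local feature distributions of a reference and a
# reconstruction. The pooling width sigma moves the measure from pixel
# fidelity (sigma -> 0) to a purely statistical, realism-style comparison
# (sigma -> infinity).
import numpy as np
from scipy.stats import wasserstein_distance  # 1-D Wasserstein-1


def pooling_weights(center, n, sigma):
    """Gaussian pooling kernel over positions 0..n-1, centered at `center`."""
    pos = np.arange(n)
    if sigma == 0.0:                      # degenerate kernel: a point mass
        w = (pos == center).astype(float)
    else:
        w = np.exp(-0.5 * ((pos - center) / sigma) ** 2)
    return w / w.sum()


def toy_wasserstein_distortion(x, y, sigma):
    """Average, over locations, of the Wasserstein-1 distance between the
    locally pooled (empirical) feature distributions of x and y."""
    n = len(x)
    total = 0.0
    for i in range(n):
        w = pooling_weights(i, n, sigma)
        total += wasserstein_distance(x, y, u_weights=w, v_weights=w)
    return total / n


rng = np.random.default_rng(0)
x = rng.normal(size=256)   # "reference" feature sequence
y = rng.normal(size=256)   # independent realization with the same statistics

# sigma = 0: each pooled distribution is a point mass, so the result is
# the mean absolute per-location error (pure fidelity, large here).
print(toy_wasserstein_distortion(x, y, sigma=0.0))
# sigma very large: pooling is effectively global, so only the empirical
# distributions are compared (pure realism, small here).
print(toy_wasserstein_distortion(x, y, sigma=1e6))
```

Intermediate values of sigma behave like the texture example in the abstract: features near a given location must match the reference closely, while far-away regions only need to match statistically.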