
Computation and Critical Transitions of Rate-Distortion-Perception Functions With Wasserstein Barycenter (2404.04681v3)

Published 6 Apr 2024 in cs.IT and math.IT

Abstract: The information rate-distortion-perception (RDP) function characterizes the three-way trade-off between description rate, average distortion, and perceptual quality, where perceptual quality is measured by the discrepancy between probability distributions. It has been applied to emerging areas in communications empowered by generative modeling. We study several variants of the RDP function through the lens of optimal transport to characterize their critical transitions. By transforming the information RDP function into a Wasserstein barycenter problem, we identify the critical transitions at which one of the constraints becomes inactive. Further, the lack of strict convexity introduced by the perceptual constraint can be remedied by an entropy regularization term. We prove that the entropy-regularized model converges to the original problem and propose an alternating iteration method based on the Sinkhorn algorithm to solve the regularized optimization problem numerically. In many practical scenarios, computing the distortion-rate-perception (DRP) function offers a way to minimize distortion and perceptual discrepancy under rate constraints. However, interchanging the rate objective and the distortion constraint significantly increases the complexity. The proposed method effectively addresses this complexity, providing an efficient solution for DRP functions. Using our numerical method, we propose a reverse data hiding scheme that imperceptibly embeds a secret message into an image, ensuring perceptual fidelity and achieving a significant improvement in the perceptual quality of the stego image compared to traditional methods under the same embedding rate. Our theoretical results and numerical method lay an attractive foundation for steganographic communications with perceptual quality constraints.
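For readers unfamiliar with the Sinkhorn algorithm named in the abstract, the following is a minimal sketch of the standard Sinkhorn iteration for entropy-regularized optimal transport between two discrete distributions. It is not the paper's alternating scheme for the RDP or DRP problem; it only illustrates the row-and-column rescaling step that such schemes build on. The function name `sinkhorn`, the regularization strength `eps`, and the toy distributions are assumptions chosen for illustration.

```python
import math


def sinkhorn(p, q, cost, eps=0.5, iters=200):
    """Entropy-regularized optimal transport between discrete p and q.

    Generic textbook iteration (illustrative only, not the paper's method).
    Returns a coupling whose column sums match q and whose row sums
    approach p as the iteration converges.
    """
    n, m = len(p), len(q)
    # Gibbs kernel K[i][j] = exp(-cost[i][j] / eps)
    K = [[math.exp(-cost[i][j] / eps) for j in range(m)] for i in range(n)]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(iters):
        # Rescale rows toward marginal p, then columns toward marginal q
        u = [p[i] / sum(K[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [q[j] / sum(K[i][j] * u[i] for i in range(n)) for j in range(m)]
    return [[u[i] * K[i][j] * v[j] for j in range(m)] for i in range(n)]


# Toy example: two-point source and reproduction distributions, 0/1 cost
p = [0.5, 0.5]
q = [0.25, 0.75]
cost = [[0.0, 1.0], [1.0, 0.0]]
plan = sinkhorn(p, q, cost)
row_sums = [sum(row) for row in plan]
col_sums = [sum(plan[i][j] for i in range(2)) for j in range(2)]
```

A smaller `eps` brings the regularized coupling closer to the unregularized optimal transport plan but slows convergence, which is the usual trade-off when entropy regularization is used to restore strict convexity.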

Authors
  1. Chunhui Chen
  2. Xueyan Niu
  3. Wenhao Ye
  4. Hao Wu
  5. Bo Bai