
Conditional Residual Coding: A Remedy for Bottleneck Problems in Conditional Inter Frame Coding (2307.12864v2)

Published 24 Jul 2023 in eess.IV

Abstract: Conditional coding is a new video coding paradigm enabled by neural-network-based compression. It can be shown that conditional coding is, in theory, better than traditional residual coding, which is widely used in video compression standards such as HEVC and VVC. On closer inspection, however, conditional coders can suffer from information bottlenecks in the prediction path: due to the data processing inequality, not all information from the prediction signal can be passed to the reconstructed signal, which impairs coder performance. In this paper we propose the conditional residual coding concept, which we derive from information-theoretic properties of the conditional coder. This coder significantly reduces the influence of bottlenecks while maintaining the theoretical performance of the conditional coder. We provide a theoretical analysis of the coding paradigm and demonstrate the performance of the conditional residual coder in a practical example. We show that conditional residual coders alleviate the disadvantages of conditional coders while maintaining their advantages over residual coders. In the spectrum of residual and conditional coding, we can therefore consider them "the best of both worlds".
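
The entropy relations behind the abstract's claims can be checked on a toy example. The sketch below is not taken from the paper: the joint distribution, the symbols x (frame to code), p (prediction), and r = x - p (residual), and the parity function standing in for a bottleneck are all assumptions chosen for illustration. It compares four ideal lossless rates: residual coding H(r), conditional coding H(x | p), conditional coding through a bottleneck H(x | f(p)), and one reading of conditional residual coding, H(r | f(p)), in which the residual is formed from the full prediction and only the conditioning path is bottlenecked.

```python
import math
from collections import defaultdict


def entropy(dist):
    """Shannon entropy in bits of a dict mapping outcome -> probability."""
    return -sum(q * math.log2(q) for q in dist.values() if q > 0)


def marginal(joint, fn):
    """Distribution of fn(x, p) when (x, p) follows the joint distribution."""
    out = defaultdict(float)
    for (x, p), prob in joint.items():
        out[fn(x, p)] += prob
    return dict(out)


def cond_entropy(joint, target, given):
    """H(target | given) computed as H(target, given) - H(given)."""
    both = marginal(joint, lambda x, p: (target(x, p), given(x, p)))
    cond = marginal(joint, given)
    return entropy(both) - entropy(cond)


# Hypothetical toy source: the prediction p is uniform on {0, 1, 2, 3}.
# Even predictions are exact (x = p); odd predictions are off by -1, 0, +1
# with probabilities 1/4, 1/2, 1/4, so the noise statistics depend on p.
joint = {}
for p in range(4):
    noise = {0: 1.0} if p % 2 == 0 else {-1: 0.25, 0: 0.5, 1: 0.25}
    for n, q in noise.items():
        joint[(p + n, p)] = 0.25 * q


def signal(x, p):
    return x


def residual(x, p):
    return x - p


def full_pred(x, p):
    return p


def bottleneck(x, p):
    # Assumed lossy function f(p): only the parity of p gets through.
    return p % 2


rates = {
    "residual H(r)": entropy(marginal(joint, residual)),
    "conditional H(x|p)": cond_entropy(joint, signal, full_pred),
    "bottlenecked conditional H(x|f(p))": cond_entropy(joint, signal, bottleneck),
    "conditional residual H(r|f(p))": cond_entropy(joint, residual, bottleneck),
}

for name, bits in rates.items():
    print(f"{name}: {bits:.3f} bits")
```

On this toy source the four rates come out at roughly 1.06, 0.75, 1.63, and 0.75 bits. The ordering H(x | p) = H(r | p) <= H(r | f(p)) <= H(r) holds for any source and any f, because conditioning never increases entropy; that the bottlenecked conditional coder ends up worse than plain residual coding, and that the conditional residual rate matches the unbottlenecked conditional rate exactly, are properties of this particular example rather than general results.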
