Instance-Adaptive Video Compression: Improving Neural Codecs by Training on the Test Set (2111.10302v2)

Published 19 Nov 2021 in eess.IV, cs.CV, and cs.LG

Abstract: We introduce a video compression algorithm based on instance-adaptive learning. On each video sequence to be transmitted, we finetune a pretrained compression model. The optimal parameters are transmitted to the receiver along with the latent code. By entropy-coding the parameter updates under a suitable mixture model prior, we ensure that the network parameters can be encoded efficiently. This instance-adaptive compression algorithm is agnostic about the choice of base model and has the potential to improve any neural video codec. On the UVG, HEVC, and Xiph datasets, our codec improves the performance of a scale-space flow model by 21% to 27% in BD-rate savings, and that of a state-of-the-art B-frame model by 17% to 20% in BD-rate savings. We also demonstrate that instance-adaptive finetuning improves robustness to domain shift. Finally, our approach reduces the capacity requirements of compression models: it enables competitive performance even after reducing the network size by 70%.
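
To make the idea of entropy-coding parameter updates under a mixture prior more concrete, here is a minimal sketch of how the bit cost of a set of quantized updates might be estimated. The abstract only states that a "suitable mixture model prior" is used; the two-component zero-mean Gaussian mixture below (a narrow component that makes near-zero updates cheap and a wide component that keeps larger updates affordable), the function names, and all numeric defaults are illustrative assumptions, not the paper's actual configuration.

```python
import math


def gaussian_cdf(x, sigma):
    """CDF of a zero-mean Gaussian with standard deviation sigma."""
    return 0.5 * (1.0 + math.erf(x / (sigma * math.sqrt(2.0))))


def update_bits(deltas, step=0.01, narrow_sigma=0.005, wide_sigma=0.1, wide_weight=0.1):
    """Estimate the code length (in bits) of parameter updates.

    Each update is quantized to a multiple of `step`, and its probability is
    the mass that a two-component Gaussian mixture assigns to the matching
    quantization bin. All parameter values here are hypothetical defaults.
    """
    total_bits = 0.0
    for delta in deltas:
        q = round(delta / step) * step           # quantized update
        lo, hi = q - step / 2.0, q + step / 2.0  # edges of its quantization bin
        p_narrow = gaussian_cdf(hi, narrow_sigma) - gaussian_cdf(lo, narrow_sigma)
        p_wide = gaussian_cdf(hi, wide_sigma) - gaussian_cdf(lo, wide_sigma)
        p = (1.0 - wide_weight) * p_narrow + wide_weight * p_wide
        total_bits += -math.log2(max(p, 1e-300))  # guard against log(0)
    return total_bits


# Unchanged parameters (zero updates) should cost far fewer bits than changed ones.
bits_unchanged = update_bits([0.0] * 10)
bits_changed = update_bits([0.05] * 10)
print(bits_unchanged, bits_changed)
```

Under a prior of this shape, parameters left at their pretrained values cost little to transmit, so finetuning is only worthwhile for updates whose distortion benefit outweighs their bit cost. A finetuning objective could add this estimate to the latent rate in a rate-distortion loss, though how the terms are weighted is not specified in the abstract.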
