Accelerating Learnt Video Codecs with Gradient Decay and Layer-wise Distillation (2312.02605v1)

Published 5 Dec 2023 in eess.IV and cs.CV

Abstract: In recent years, end-to-end learnt video codecs have demonstrated their potential to compete with conventional coding algorithms in terms of compression efficiency. However, most learning-based video compression models are associated with high computational complexity and latency, in particular at the decoder side, which limits their deployment in practical applications. In this paper, we present a novel model-agnostic pruning scheme based on gradient decay and adaptive layer-wise distillation. Gradient decay enhances parameter exploration during sparsification whilst preventing runaway sparsity, and is superior to the standard Straight-Through Estimation. The adaptive layer-wise distillation regulates the sparse training in various stages based on the distortion of intermediate features. This stage-wise design efficiently updates parameters with minimal computational overhead. The proposed approach has been applied to three popular end-to-end learnt video codecs, FVC, DCVC, and DCVC-HEM. Results confirm that our method yields up to 65% reduction in MACs and 2x speed-up with less than 0.3dB drop in BD-PSNR. Supporting code and supplementary material can be downloaded from: https://jasminepp.github.io/lightweightdvc/
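The abstract contrasts two ideas: a pruning scheme in which pruned weights receive a decayed gradient rather than the unchanged gradient of Straight-Through Estimation (STE), and a distillation loss over intermediate features whose per-layer weighting adapts to how distorted each feature is. The PyTorch sketch below illustrates one plausible reading of both ideas; the names (PruneSTE, PruneGradientDecay, magnitude_mask, adaptive_layerwise_distillation), the annealed decay factor, and the distortion-proportional weighting are illustrative assumptions, not the authors' implementation (see the linked project page for that).

import torch
import torch.nn.functional as F


class PruneSTE(torch.autograd.Function):
    # Straight-Through Estimation baseline: the binary mask is applied in the
    # forward pass, but the gradient flows back to every weight unchanged.
    @staticmethod
    def forward(ctx, weight, mask):
        return weight * mask

    @staticmethod
    def backward(ctx, grad_output):
        return grad_output, None


class PruneGradientDecay(torch.autograd.Function):
    # Gradient-decay variant (sketch): pruned weights receive a gradient scaled
    # by `decay`, which is annealed towards zero during sparse training, so
    # weights can re-enter the mask early on while sparsity stabilises later.
    @staticmethod
    def forward(ctx, weight, mask, decay):
        ctx.save_for_backward(mask)
        ctx.decay = decay
        return weight * mask

    @staticmethod
    def backward(ctx, grad_output):
        (mask,) = ctx.saved_tensors
        grad_weight = grad_output * (mask + (1.0 - mask) * ctx.decay)
        return grad_weight, None, None


def magnitude_mask(weight, sparsity):
    # Keep the largest-magnitude (1 - sparsity) fraction of weights.
    k = max(1, int(weight.numel() * (1.0 - sparsity)))
    threshold = torch.topk(weight.abs().flatten(), k).values.min()
    return (weight.abs() >= threshold).float()


def adaptive_layerwise_distillation(student_feats, teacher_feats, eps=1e-8):
    # Weight each intermediate-feature MSE by its relative distortion, so the
    # most distorted layers dominate the loss at the current training stage.
    per_layer = [F.mse_loss(s, t.detach()) for s, t in zip(student_feats, teacher_feats)]
    total = sum(per_layer) + eps
    return sum((d / total).detach() * d for d in per_layer)


# Usage inside a conv layer's forward pass (sketch), with decay annealed 1.0 -> 0.0:
# w = PruneGradientDecay.apply(self.weight, magnitude_mask(self.weight, 0.5), decay)

The point of contrast is the backward pass: STE returns grad_output unchanged to all weights, whereas the decayed variant attenuates the gradient reaching pruned weights by the current decay factor.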

References (27)
  1. B. Bross, Y.-K. Wang, Y. Ye, S. Liu, J. Chen, G. J. Sullivan, and J.-R. Ohm, “Overview of the versatile video coding (VVC) standard and its applications,” IEEE Transactions on Circuits and Systems for Video Technology, vol. 31, no. 10, pp. 3736–3764, 2021.
  2. D. Ma, F. Zhang, and D. R. Bull, “MFRNet: a new CNN architecture for post-processing and in-loop filtering,” IEEE Journal of Selected Topics in Signal Processing, vol. 15, no. 2, pp. 378–387, 2020.
  3. C. Feng, D. Danier, C. Tan, F. Zhang, and D. Bull, “ViSTRA3: Video coding with deep parameter adaptation and post processing,” in 2022 IEEE International Symposium on Circuits and Systems (ISCAS). IEEE, 2022, pp. 824–828.
  4. D. Ma, M. Afonso, F. Zhang, and D. R. Bull, “Perceptually-inspired super-resolution of compressed videos,” in Applications of Digital Image Processing XLII, vol. 11137. SPIE, 2019, pp. 310–318.
  5. G. Lu, W. Ouyang, D. Xu, X. Zhang, C. Cai, and Z. Gao, “DVC: An end-to-end deep video compression framework,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2019, pp. 11006–11015.
  6. Z. Hu, G. Lu, and D. Xu, “FVC: A new framework towards deep video compression in feature space,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2021, pp. 1502–1511.
  7. J. Li, B. Li, and Y. Lu, “Deep contextual video compression,” Advances in Neural Information Processing Systems, vol. 34, 2021.
  8. G. Gao, P. You, R. Pan, S. Han, Y. Zhang, Y. Dai, and H. Lee, “Neural image compression via attentional multi-scale back projection and frequency decomposition,” in Proceedings of the IEEE/CVF International Conference on Computer Vision, 2021, pp. 14677–14686.
  9. J. Li, B. Li, and Y. Lu, “Hybrid spatial-temporal entropy modelling for neural video compression,” in Proceedings of the 30th ACM International Conference on Multimedia, 2022, pp. 1503–1511.
  10. J. Li, B. Li, and Y. Lu, “Neural video compression with diverse contexts,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023.
  11. H. M. Kwan, G. Gao, F. Zhang, A. Gower, and D. Bull, “HiNeRV: Video Compression with Hierarchical Encoding based Neural Representation,” arXiv preprint arXiv:2306.09818, 2023.
  12. G.-H. Wang, J. Li, B. Li, and Y. Lu, “EVC: Towards real-time neural image compression with mask decay,” arXiv preprint arXiv:2302.05071, 2023.
  13. A. Luo, H. Sun, J. Liu, and J. Katto, “Memory-efficient learned image compression with pruned hyperprior module,” in 2022 IEEE International Conference on Image Processing (ICIP). IEEE, 2022, pp. 3061–3065.
  14. S. Yin, C. Li, F. Meng, W. Tan, Y. Bao, Y. Liang, and W. Liu, “Exploring structural sparsity in neural image compression,” in 2022 IEEE International Conference on Image Processing (ICIP). IEEE, 2022, pp. 471–475.
  15. H. Sun, L. Yu, and J. Katto, “Q-LIC: Quantizing learned image compression with channel splitting,” IEEE Transactions on Circuits and Systems for Video Technology, 2022.
  16. J.-H. Kim, J.-H. Choi, J. Chang, and J.-S. Lee, “Efficient deep learning-based lossy image compression via asymmetric autoencoder and pruning,” in ICASSP 2020-2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2020, pp. 2063–2067.
  17. Z. Hu and D. Xu, “Complexity-guided slimmable decoder for efficient deep video compression,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023, pp. 14358–14367.
  18. Z. Liu, L. Herranz, F. Yang, S. Zhang, S. Wan, M. Mrak, and M. G. Blanch, “Slimmable video codec,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022, pp. 1743–1747.
  19. Y. Li, K. Adamczewski, W. Li, S. Gu, R. Timofte, and L. Van Gool, “Revisiting random channel pruning for neural network compression,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022, pp. 191–201.
  20. A. I. Nowak, B. Grooten, D. C. Mocanu, and J. Tabor, “Fantastic weights and how to find them: Where to prune in dynamic sparse training,” arXiv preprint arXiv:2306.12230, 2023.
  21. Y. Bengio, N. Léonard, and A. Courville, “Estimating or propagating gradients through stochastic neurons for conditional computation,” arXiv preprint arXiv:1308.3432, 2013.
  22. I. Lazarevich, A. Kozlov, and N. Malinin, “Post-training deep neural network pruning via layer-wise calibration,” in Proceedings of the IEEE/CVF International Conference on Computer Vision, 2021, pp. 798–805.
  23. T. Xue, B. Chen, J. Wu, D. Wei, and W. T. Freeman, “Video enhancement with task-oriented flow,” International Journal of Computer Vision, vol. 127, pp. 1106–1125, 2019.
  24. X. Sheng, J. Li, B. Li, L. Li, D. Liu, and Y. Lu, “Temporal context mining for learned video compression,” IEEE Transactions on Multimedia, 2022.
  25. F. Mentzer, G. Toderici, D. Minnen, S.-J. Hwang, S. Caelles, M. Lucic, and E. Agustsson, “VCT: A video compression transformer,” arXiv preprint arXiv:2206.07307, 2022.
  26. A. Mercat, M. Viitanen, and J. Vanne, “UVG Dataset: 50/120fps 4K Sequences for Video Codec Analysis and Development,” in MMSys. ACM, 2020, pp. 297–302.
  27. H. Wang, W. Gan, S. Hu, J. Y. Lin, L. Jin, L. Song, P. Wang, I. Katsavounidis, A. Aaron, and C. J. Kuo, “MCL-JCV: A JND-based H.264/AVC video quality assessment dataset,” in ICIP. IEEE, 2016, pp. 1509–1513.