
Conditional Neural Video Coding with Spatial-Temporal Super-Resolution (2401.13959v1)

Published 25 Jan 2024 in eess.IV and cs.CV

Abstract: This document is an expanded version of a one-page abstract originally presented at the 2024 Data Compression Conference. It describes our proposed method for the video track of the Challenge on Learned Image Compression (CLIC) 2024. Our scheme follows the typical hybrid coding framework and adds several novel techniques. First, we adopt the Spynet network to produce accurate motion vectors for motion estimation. Second, we introduce a context mining scheme with conditional frame coding to fully exploit spatial-temporal information. Third, to meet the low target bitrates set by CLIC, we integrate spatial-temporal super-resolution modules to improve rate-distortion performance. Our team name is IMCLVC.

Authors (5)
  1. Henan Wang
  2. Xiaohan Pan
  3. Runsen Feng
  4. Zongyu Guo
  5. Zhibo Chen
