Hierarchical Autoregressive Modeling for Neural Video Compression (2010.10258v3)
Published 19 Oct 2020 in eess.IV and cs.LG
Abstract: Recent work by Marino et al. (2020) showed improved performance in sequential density estimation by combining masked autoregressive flows with hierarchical latent variable models. We draw a connection between such autoregressive generative models and the task of lossy video compression. Specifically, we view recent neural video compression methods (Lu et al., 2019; Yang et al., 2020b; Agustsson et al., 2020) as instances of a generalized stochastic temporal autoregressive transform, and propose avenues for enhancement based on this insight. Comprehensive evaluations on large-scale video data show improved rate-distortion performance over both state-of-the-art neural and conventional video compression methods.
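To make the idea of a temporal autoregressive transform concrete, here is a minimal sketch, not the paper's model. It assumes a simple shift-and-scale form in which each frame is predicted from the previous one and only the normalized residual is kept. The function names and the hand-written predictor `predict_shift_scale` are hypothetical stand-ins for learned networks, and quantization and entropy coding are left out, so the round trip is lossless.

```python
def predict_shift_scale(prev_frame):
    """Hypothetical predictor (a learned network in practice).

    Returns a per-pixel shift (here: "the next frame equals the previous one")
    and a strictly positive per-pixel scale.
    """
    shift = list(prev_frame)
    scale = [1.0 + 0.5 * abs(p) for p in prev_frame]
    return shift, scale


def encode_sequence(frames):
    """Map frames x_t to residuals y_t = (x_t - shift) / scale."""
    residuals = []
    prev = [0.0] * len(frames[0])  # assumed all-zero context for the first frame
    for frame in frames:
        shift, scale = predict_shift_scale(prev)
        residuals.append([(x - m) / s for x, m, s in zip(frame, shift, scale)])
        prev = frame
    return residuals


def decode_sequence(residuals):
    """Invert the transform: x_t = shift + scale * y_t, frame by frame."""
    frames = []
    prev = [0.0] * len(residuals[0])
    for res in residuals:
        shift, scale = predict_shift_scale(prev)
        frame = [m + s * y for y, m, s in zip(res, shift, scale)]
        frames.append(frame)
        prev = frame
    return frames


if __name__ == "__main__":
    video = [[0.1, 0.5, -0.2], [0.2, 0.4, -0.1], [0.3, 0.3, 0.0]]
    restored = decode_sequence(encode_sequence(video))
    print(restored)
```

In a lossy codec the residuals would be quantized before transmission, and the encoder would condition on the reconstructed previous frame rather than the original so that encoder and decoder stay in sync. That step is omitted here for brevity.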
- Scale-space flow for end-to-end optimized video compression. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8503–8512, 2020.
- Stochastic variational video prediction. In International Conference on Learning Representations, 2018.
- End-to-end optimized image compression. In 5th International Conference on Learning Representations, ICLR 2017, 2017.
- Variational image compression with a scale hyperprior. International Conference on Learning Representations, 2018.
- Fabrice Bellard. BPG image format, 2014. URL https://bellard.org/bpg/bpg_spec.txt.
- Estimating or propagating gradients through stochastic neurons for conditional computation. arXiv preprint arXiv:1308.3432, 2013.
- Quo vadis, action recognition? A new model and the Kinetics dataset. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 6299–6308, 2017.
- Deepcoder: A deep neural network based video compression. In 2017 IEEE Visual Communications and Image Processing (VCIP), pp. 1–4, 2017.
- Learning for video compression. IEEE Transactions on Circuits and Systems for Video Technology, 30(2):566–576, 2019.
- A recurrent latent variable model for sequential data. In Advances in neural information processing systems, pp. 2980–2988, 2015.
- Cassius C Cutler. Differential quantization of communication signals, July 29 1952. US Patent 2,605,361.
- Stochastic video generation with a learned prior. In International Conference on Machine Learning, pp. 1174–1183. PMLR, 2018.
- Nice: Non-linear independent components estimation. arXiv preprint arXiv:1410.8516, 2014.
- Density estimation using real nvp. arXiv preprint arXiv:1605.08803, 2016.
- Neural inter-frame compression for video coding. In 2019 IEEE/CVF International Conference on Computer Vision (ICCV), pp. 6420–6428, 2019.
- Flownet: Learning optical flow with convolutional networks. In Proceedings of the IEEE international conference on computer vision, pp. 2758–2766, 2015.
- Compression without quantization. In OpenReview, 2019.
- Feedback recurrent autoencoder for video compression. arXiv preprint arXiv:2004.04342, 2020.
- Video compression with rate-distortion autoencoders. In Proceedings of the IEEE International Conference on Computer Vision, pp. 7033–7042, 2019.
- Deep generative video compression. In H. Wallach, H. Larochelle, A. Beygelzimer, F. d'Alché-Buc, E. Fox, and R. Garnett (eds.), Advances in Neural Information Processing Systems, volume 32. Curran Associates, Inc., 2019.
- Improved lossy image compression with priming and spatially adaptive bit rates for recurrent networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4385–4393, 2018.
- Adam: A method for stochastic optimization. In International Conference on Learning Representations, 2015.
- Auto-encoding variational bayes. arXiv preprint arXiv:1312.6114, 2013.
- Glow: Generative flow with invertible 1x1 convolutions. In Advances in Neural Information Processing Systems, pp. 10215–10224, 2018.
- Improved variational inference with inverse autoregressive flow. In Advances in neural information processing systems, pp. 4743–4751, 2016.
- Stochastic adversarial video prediction. arXiv preprint arXiv:1804.01523, 2018.
- Disentangled sequential autoencoder. In Jennifer Dy and Andreas Krause (eds.), Proceedings of the 35th International Conference on Machine Learning, volume 80 of Proceedings of Machine Learning Research, pp. 5670–5679, Stockholmsmässan, Stockholm Sweden, 10–15 Jul 2018. PMLR.
- Learned video compression via joint spatial-temporal correlation exploration. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 34, pp. 11580–11587, 2020.
- Dvc: An end-to-end deep video compression framework. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 11006–11015, 2019.
- Improving sequential latent variable models with autoregressive flows. In Symposium on Advances in Approximate Bayesian Inference, pp. 1–16, 2020.
- Uvg dataset: 50/120fps 4k sequences for video codec analysis and development. In Proceedings of the 11th ACM Multimedia Systems Conference, pp. 297–302, 2020.
- Channel-wise autoregressive entropy models for learned image compression. In 2020 IEEE International Conference on Image Processing (ICIP), pp. 3339–3343. IEEE, 2020.
- Joint autoregressive and hierarchical priors for learned image compression. In Advances in Neural Information Processing Systems, pp. 10771–10780, 2018.
- Masked autoregressive flow for density estimation. In Advances in Neural Information Processing Systems, pp. 2338–2347, 2017.
- Variational inference with normalizing flows. In Proceedings of the 32nd International Conference on International Conference on Machine Learning-Volume 37, pp. 1530–1538, 2015.
- Learned video compression. 2019 IEEE/CVF International Conference on Computer Vision (ICCV), pp. 3453–3462, 2019.
- Deep state space models for unconditional word generation. In Advances in Neural Information Processing Systems, pp. 6158–6168, 2018.
- Autoregressive text generation beyond feedback loops. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pp. 3391–3397, 2019.
- Overview of the high efficiency video coding (HEVC) standard. IEEE Transactions on Circuits and Systems for Video Technology, 22(12):1649–1668, 2012.
- Lossy image compression with compressive autoencoders. International Conference on Learning Representations, 2017.
- Full resolution image compression with recurrent neural networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), July 2017.
- Generating videos with scene dynamics. In Advances in neural information processing systems, pp. 613–621, 2016.
- MCL-JCV: a JND-based H.264/AVC video quality assessment dataset. In 2016 IEEE International Conference on Image Processing (ICIP), pp. 1509–1513. IEEE, 2016.
- Multiscale structural similarity for image quality assessment. In The Thirty-Seventh Asilomar Conference on Signals, Systems & Computers, 2003, volume 2, pp. 1398–1402. IEEE, 2003.
- Overview of the H.264/AVC video coding standard. IEEE Transactions on Circuits and Systems for Video Technology, 13(7):560–576, 2003.
- Video compression through image interpolation. In Proceedings of the European Conference on Computer Vision (ECCV), pp. 416–431, 2018.
- Video enhancement with task-oriented flow. International Journal of Computer Vision (IJCV), 127(8):1106–1125, 2019.
- Learning for video compression with hierarchical quality and recurrent enhancement. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2020a.
- Deep generative video compression with temporal autoregressive transforms. ICML 2020 Workshop on Invertible Neural Networks, Normalizing Flows, and Explicit Likelihood Models, 2020b.
- Feedback recurrent autoencoder. In ICASSP 2020-2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 3347–3351. IEEE, 2020c.
- Improving inference for neural image compression. Advances in Neural Information Processing Systems, 33, 2020d.
- Variational Bayesian quantization. In International Conference on Machine Learning, 2020e.
Authors: Ruihan Yang, Yibo Yang, Joseph Marino, Stephan Mandt