Interactive $360^{\circ}$ Video Streaming Using FoV-Adaptive Coding with Temporal Prediction (2403.11155v1)
Abstract: For $360^{\circ}$ video streaming, FoV-adaptive coding that allocates more bits to the user's predicted field of view (FoV) is an effective way to maximize the rendered video quality under limited bandwidth. We develop a low-latency FoV-adaptive coding and streaming system for interactive applications that is robust to bandwidth variations and FoV prediction errors. To minimize the end-to-end delay while maximizing the coding efficiency, we propose a frame-level FoV-adaptive inter-coding structure. In each frame, regions that are in or near the predicted FoV are coded using temporal and spatial prediction, while a small rotating region is coded with spatial prediction only. This rotating intra region periodically refreshes the entire frame, thereby providing robustness to both FoV prediction errors and frame losses due to transmission errors. The system adapts the sizes and rates of the different regions for each video segment to maximize the rendered video quality under the predicted bandwidth constraint. Integrating such frame-level FoV adaptation with temporal prediction is challenging because the FoV varies over time. We propose novel ways of modeling the influence of FoV dynamics on the quality-rate performance of temporal predictive coding. We further develop LSTM-based machine learning models to predict the user's FoV and the network bandwidth. The proposed system is compared with three benchmark systems, using real-world network bandwidth traces and FoV traces, and is shown to significantly improve the rendered video quality while achieving very low end-to-end delay and low frame-freeze probability.
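The rotating intra region can be illustrated with a minimal sketch. The abstract does not specify the shape or size of the region, so the code below assumes the frame is split into equal-width vertical columns and that a fixed number of adjacent columns is intra-coded in each frame. The function names `intra_columns` and `refresh_period` and the parameters `num_columns` and `intra_width` are hypothetical and are not taken from the paper.

```python
def intra_columns(frame_idx: int, num_columns: int, intra_width: int) -> list[int]:
    """Columns coded with spatial prediction only in the given frame.

    Assumption: the intra region is a band of `intra_width` adjacent columns
    that advances by its own width every frame and wraps around the frame.
    """
    start = (frame_idx * intra_width) % num_columns
    return [(start + k) % num_columns for k in range(intra_width)]


def refresh_period(num_columns: int, intra_width: int) -> int:
    """Number of consecutive frames needed to intra-refresh every column."""
    return -(-num_columns // intra_width)  # ceiling division


if __name__ == "__main__":
    n_cols, width = 12, 3  # illustrative values only
    for f in range(refresh_period(n_cols, width)):
        print(f, intra_columns(f, n_cols, width))
```

Under these assumptions, any window of `refresh_period(num_columns, intra_width)` consecutive frames intra-codes every column at least once. In this simplified model, that is what bounds how long an error from a lost frame or a mispredicted FoV can persist.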