Imitation Learning for Adaptive Video Streaming with Future Adversarial Information Bottleneck Principle (2405.03692v1)
Abstract: Adaptive video streaming plays a crucial role in ensuring high-quality video streaming services. Despite extensive research on Adaptive BitRate (ABR) techniques, current reinforcement learning (RL)-based ABR algorithms may improve the average Quality of Experience (QoE) but suffer from fluctuating performance in individual video sessions. In this paper, we present a novel approach that combines imitation learning with the information bottleneck technique, to learn from the complex offline optimal scenario rather than from inefficient exploration. In particular, we use the deterministic offline bitrate optimization problem with the future throughput realization as the expert and formulate it as a mixed-integer non-linear programming (MINLP) problem. To enable large-scale training for improved performance, we propose an alternative optimization algorithm that efficiently solves the MINLP problem. To address the overfitting caused by future information leakage in the MINLP, we incorporate an adversarial information bottleneck framework. By compressing the video streaming state into a latent space, we retain only action-relevant information. Additionally, we introduce a future adversarial term to mitigate the influence of future information leakage, where a Model Predictive Control (MPC) policy without any future information is employed as the adverse expert. Experimental results demonstrate the effectiveness of our proposed approach in significantly enhancing the quality of adaptive video streaming, providing a 7.30% average QoE improvement and a 30.01% average ranking reduction.
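To make the idea of an offline expert with full knowledge of future throughput concrete, the following is a minimal sketch, not the paper's method. It assumes a simple linear QoE (bitrate reward minus rebuffering and bitrate-switch penalties), treats throughput as constant while each chunk downloads, counts startup delay as rebuffering, and replaces the MINLP solver described in the abstract with exhaustive search, which is only feasible for a handful of chunks. The function names, penalty weights, and chunk duration are all hypothetical.

```python
from itertools import product


def simulate_qoe(plan_kbps, throughput_kbps, chunk_sec=4.0,
                 rebuf_penalty=4.3, smooth_penalty=1.0):
    """Score a bitrate plan under an ASSUMED linear QoE model.

    QoE = sum(bitrate in Mbps) - rebuf_penalty * rebuffer seconds
          - smooth_penalty * |bitrate change in Mbps|.
    """
    buffer_sec = 0.0  # playback buffer starts empty
    qoe = 0.0
    prev = None
    for rate, tput in zip(plan_kbps, throughput_kbps):
        download_sec = rate * chunk_sec / tput
        # Stall time is whatever the buffer cannot cover during the download.
        rebuf = max(download_sec - buffer_sec, 0.0)
        buffer_sec = max(buffer_sec - download_sec, 0.0) + chunk_sec
        qoe += rate / 1000.0 - rebuf_penalty * rebuf
        if prev is not None:
            qoe -= smooth_penalty * abs(rate - prev) / 1000.0
        prev = rate
    return qoe


def offline_optimal(ladder_kbps, throughput_kbps, **kw):
    """Exhaustively search all plans, given the full future throughput trace."""
    best = max(product(ladder_kbps, repeat=len(throughput_kbps)),
               key=lambda plan: simulate_qoe(plan, throughput_kbps, **kw))
    return list(best), simulate_qoe(best, throughput_kbps, **kw)


if __name__ == "__main__":
    ladder = [300, 750, 1200]    # hypothetical bitrate ladder (kbps)
    trace = [2000, 2000, 2000]   # hypothetical per-chunk throughput (kbps)
    plan, score = offline_optimal(ladder, trace)
    print(plan, round(score, 2))
```

A plan produced this way depends on throughput values that an online policy never observes, which is the future information leakage the abstract refers to. The search cost grows exponentially with the number of chunks, which is why a scalable solver is needed for large-scale training.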