COIN: Chance-Constrained Imitation Learning for Uncertainty-aware Adaptive Resource Oversubscription Policy (2401.07051v1)
Abstract: We address the challenge of learning safe and robust decision policies in the presence of uncertainty, in the context of the real scientific problem of adaptive resource oversubscription, where the goal is to enhance resource efficiency while ensuring safety against resource congestion risk. Traditional supervised prediction or forecasting models are ineffective at learning adaptive policies, whereas standard online optimization and reinforcement learning are difficult to deploy on real systems. Offline methods such as imitation learning (IL) are ideal because they can directly leverage historical resource usage telemetry. However, the underlying aleatoric uncertainty in such telemetry is a critical bottleneck. We solve this with our proposed chance-constrained imitation learning framework, which ensures implicit safety against uncertainty in a principled manner via a combination of stochastic (chance) constraints on resource congestion risk and ensemble value functions. This leads to a substantial ($\approx 3$-$4\times$) improvement in resource efficiency and safety in many oversubscription scenarios, including resource management in cloud services.
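To make the idea of a chance constraint on congestion risk concrete, the following is a minimal illustrative sketch and not the COIN algorithm itself. It assumes a hypothetical setup in which historical telemetry is reduced to samples of the fraction of allocated resources actually used, and it picks the largest oversubscription ratio whose empirical congestion probability stays below a tolerance `epsilon`. The function names, the candidate grid, and the congestion definition (demand exceeding capacity) are assumptions made for illustration only.

```python
import numpy as np


def congestion_risk(usage_fraction, ratio):
    """Empirical probability that demand exceeds capacity.

    Assumption: with oversubscription `ratio`, the allocated amount is
    ratio * capacity, so realized demand exceeds capacity whenever
    usage_fraction * ratio > 1.
    """
    usage_fraction = np.asarray(usage_fraction, dtype=float)
    return float(np.mean(usage_fraction * ratio > 1.0))


def max_safe_ratio(usage_fraction, candidates, epsilon):
    """Largest candidate ratio satisfying P(congestion) <= epsilon.

    Falls back to 1.0 (no oversubscription) if no candidate is feasible.
    """
    feasible = [r for r in candidates
                if congestion_risk(usage_fraction, r) <= epsilon]
    return max(feasible) if feasible else 1.0


if __name__ == "__main__":
    # Hypothetical telemetry: fraction of allocated resources actually used.
    usage = [0.2, 0.3, 0.35, 0.45, 0.9]
    ratios = [1.0, 1.5, 2.0, 2.5, 3.0]
    print(max_safe_ratio(usage, ratios, epsilon=0.25))
```

A learned policy would replace the fixed grid search with a state-dependent decision, but the constraint it must respect has the same form: the probability of congestion under the chosen action is bounded by a small tolerance.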