Constrained Optimal Fuel Consumption of HEV: A Constrained Reinforcement Learning Approach (2403.07503v2)
Abstract: Hybrid electric vehicles (HEVs) are becoming increasingly popular because they combine the operating strengths of internal combustion engines and electric motors. However, the minimum fuel consumption an HEV can achieve under a battery electrical balance condition, for a given powertrain configuration and a given speed profile, remains unclear in both academia and industry. To address this problem, this work provides, for the first time, a mathematical formulation of constrained optimal fuel consumption (COFC) from the perspective of constrained reinforcement learning (CRL). Two mainstream CRL approaches, constrained variational policy optimization (CVPO) and Lagrangian-based methods, are then applied for the first time to obtain the vehicle's minimum fuel consumption under the battery electrical balance condition. We conduct case studies on the well-known Toyota Hybrid System (THS) of the Prius under the NEDC driving cycle, detail the key steps for implementing the CRL approaches, and compare the performance of the CVPO and Lagrangian-based approaches. The case studies show that both approaches obtain the lowest fuel consumption while maintaining the state-of-charge (SOC) balance constraint. The CVPO approach converges stably, whereas the Lagrangian-based approach reaches the lowest fuel consumption of 3.95 L/100 km, though with larger oscillations. These results verify the effectiveness of the proposed CRL approaches for the COFC problem.
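As a rough illustration of the Lagrangian-based CRL idea mentioned in the abstract (not the paper's actual implementation), the COFC problem can be sketched as a primal-dual loop: the policy is updated to maximize reward (negative fuel consumption) minus a multiplier times the SOC-balance constraint violation, while the multiplier is raised whenever the constraint is violated. The toy rollout, cost limit, and scalar policy parameter below are hypothetical placeholders standing in for the Prius THS simulator under the NEDC cycle.

```python
import numpy as np

# Hypothetical placeholder: a real COFC setup would roll out the Prius THS
# model over the NEDC cycle; this toy function merely mimics the interface.
def rollout(theta, rng):
    """Return (episode reward, episode constraint cost) for policy parameter theta.

    Reward ~ negative fuel consumption over the driving cycle.
    Cost   ~ deviation of the final SOC from its initial value (SOC balance).
    """
    reward = -(theta - 1.0) ** 2 + 0.1 * rng.standard_normal()
    cost = 0.5 * abs(theta) + 0.1 * rng.standard_normal()
    return reward, cost


def lagrangian_crl(iters=200, cost_limit=0.4, lr_theta=0.05, lr_lambda=0.1, seed=0):
    """Minimal primal-dual (Lagrangian) constrained-RL loop.

    Primal step: finite-difference ascent on reward - lambda * cost.
    Dual step:   lambda <- max(0, lambda + lr * (cost - cost_limit)).
    """
    rng = np.random.default_rng(seed)
    theta, lam = 0.0, 0.0
    for _ in range(iters):
        # Finite-difference estimate of the Lagrangian gradient w.r.t. theta.
        eps = 1e-2
        r_p, c_p = rollout(theta + eps, rng)
        r_m, c_m = rollout(theta - eps, rng)
        grad = ((r_p - lam * c_p) - (r_m - lam * c_m)) / (2 * eps)
        theta += lr_theta * grad                                # primal ascent
        _, cost = rollout(theta, rng)
        lam = max(0.0, lam + lr_lambda * (cost - cost_limit))   # dual ascent
    return theta, lam


if __name__ == "__main__":
    theta, lam = lagrangian_crl()
    print(f"policy parameter ~ {theta:.3f}, multiplier ~ {lam:.3f}")
```

In this sketch the dual variable plays the role the paper attributes to the Lagrangian-based approach: it trades fuel economy against the SOC balance requirement, which is also the mechanism that can produce the oscillations noted in the abstract; CVPO replaces this penalty-tuning step with a variational-inference-style projection.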