
Improving a Proportional Integral Controller with Reinforcement Learning on a Throttle Valve Benchmark (2402.13654v2)

Published 21 Feb 2024 in eess.SY, cs.LG, and cs.SY

Abstract: This paper presents a learning-based control strategy for non-linear throttle valves with an asymmetric hysteresis, leading to a near-optimal controller without requiring any prior knowledge about the environment. We start with a carefully tuned Proportional Integral (PI) controller and exploit the recent advances in Reinforcement Learning (RL) with Guides to improve the closed-loop behavior by learning from additional interactions with the valve. We test the proposed control method in various scenarios on three different valves, all highlighting the benefits of combining both PI and RL frameworks to improve control performance in non-linear stochastic systems. In all the experimental test cases, the resulting agent has a better sample efficiency than traditional RL agents and outperforms the PI controller.

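The control scheme outlined in the abstract pairs a hand-tuned PI controller (the "guide") with a reinforcement-learning agent that learns a corrective term from extra interactions with the valve. The sketch below is only a minimal illustration of that idea under assumed details: the class and function names, the observation layout, and the additive blending of the PI output with the learned correction are illustrative choices, not the paper's exact formulation.

```python
import numpy as np

class PIController:
    """Discrete-time proportional-integral (PI) controller for valve position."""

    def __init__(self, kp, ki, dt, u_min=0.0, u_max=1.0):
        self.kp, self.ki, self.dt = kp, ki, dt
        self.u_min, self.u_max = u_min, u_max
        self.integral = 0.0

    def act(self, error):
        # Classic PI law: u = Kp * e + Ki * integral(e dt), saturated to actuator limits.
        self.integral += error * self.dt
        u = self.kp * error + self.ki * self.integral
        return float(np.clip(u, self.u_min, self.u_max))


def guided_action(pi, rl_policy, reference, position):
    """Blend the PI 'guide' with a learned corrective term (illustrative scheme)."""
    error = reference - position
    u_pi = pi.act(error)
    # rl_policy stands in for any learned actor (e.g. a TD3/DDPG-style network) mapping
    # an observation to a small corrective action; the observation layout and the
    # additive blending used here are assumptions made for clarity.
    obs = np.array([reference, position, error, u_pi], dtype=np.float32)
    u_rl = float(rl_policy(obs))
    return float(np.clip(u_pi + u_rl, pi.u_min, pi.u_max))
```

With the learned correction initialized near zero, the combined controller starts from the PI baseline rather than from random exploration, which is the intuition behind the reported gain in sample efficiency over purely RL-based agents.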