Finite Horizon Multi-Agent Reinforcement Learning in Solving Optimal Control of State-Dependent Switched Systems

Published 8 Dec 2023 in eess.SY and cs.SY | (2312.04767v3)

Abstract: In this article, a State-dependent Multi-Agent Deep Deterministic Policy Gradient (SMADDPG) method is proposed to learn an optimal control policy for regionally switched systems. We observe good performance of this method and explain it in rigorous mathematical language, using some simplifying assumptions to motivate the ideas and to apply them to some canonical examples. Using reinforcement learning, the performance of the switched learning-based multi-agent method is compared with the vanilla DDPG in two customized demonstrative environments with one- and two-dimensional state spaces.
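To make the term "regionally switched" concrete, the following is a minimal sketch of a one-dimensional system whose dynamics and controller are both selected by the region containing the current state. It is not the authors' implementation: the boundary at zero, the two linear dynamics, the fixed linear feedback gains standing in for per-region agents, the quadratic running cost, and the names `region_index` and `rollout` are all hypothetical and chosen only for illustration.

```python
from typing import Callable, List, Sequence, Tuple


def region_index(x: float, boundaries: Sequence[float]) -> int:
    """Return the index of the region containing x.

    `boundaries` is assumed sorted in increasing order; region k covers
    the interval [boundaries[k-1], boundaries[k]).
    """
    idx = 0
    for b in boundaries:
        if x >= b:
            idx += 1
    return idx


def rollout(
    x0: float,
    boundaries: Sequence[float],
    dynamics: Sequence[Callable[[float, float], float]],
    policies: Sequence[Callable[[float], float]],
    dt: float = 0.1,
    steps: int = 50,
) -> Tuple[List[float], float]:
    """Simulate a finite-horizon rollout with forward Euler steps.

    At every step the active region picks both the dynamics and the
    policy, so each region effectively has its own controller.
    """
    x = x0
    total_cost = 0.0
    trajectory = [x]
    for _ in range(steps):
        k = region_index(x, boundaries)
        u = policies[k](x)                  # region-specific control
        total_cost += (x * x + u * u) * dt  # hypothetical quadratic cost
        x = x + dt * dynamics[k](x, u)      # region-specific dynamics
        trajectory.append(x)
    return trajectory, total_cost


# Hypothetical two-region example: the state space is split at x = 0.
boundaries = [0.0]
dynamics = [
    lambda x, u: -x + u,       # region 0 (x < 0): stable drift
    lambda x, u: 0.5 * x + u,  # region 1 (x >= 0): unstable drift
]
policies = [
    lambda x: -0.5 * x,        # placeholder for the agent of region 0
    lambda x: -2.0 * x,        # placeholder for the agent of region 1
]

traj, cost = rollout(1.0, boundaries, dynamics, policies)
```

In a learning-based setting, each entry of `policies` would presumably be replaced by a trained actor network, while the dispatch by `region_index` would stay the same.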
