Finite Horizon Multi-Agent Reinforcement Learning in Solving Optimal Control of State-Dependent Switched Systems
Abstract: This article proposes a State-dependent Multi-Agent Deep Deterministic Policy Gradient (SMADDPG) method for learning an optimal control policy for regionally switched systems. We observe good performance of the method and explain it in rigorous mathematical terms under simplifying assumptions, both to motivate the ideas and to apply them to canonical examples. The switched, learning-based multi-agent method is compared with vanilla DDPG in two customized demonstrative environments, with one- and two-dimensional state spaces.
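To make the setting concrete, the following is a minimal sketch of a regionally switched system in which the region containing the current state selects both the active dynamics and the acting agent. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not specify how agents map to regions, so the one-agent-per-region dispatch, the switching boundary at x = 0, the linear dynamics, the Euler step, and the fixed feedback gains (placeholders for trained DDPG actors) are all hypothetical.

```python
from typing import Callable, List

# Hypothetical 1-D regionally switched system: the active dynamics depend
# only on which region the current state lies in (all numbers are illustrative).
BOUNDARY = 0.0  # assumed switching surface at x = 0


def region_of(x: float) -> int:
    """Return the index of the region containing x (0: x < 0, 1: x >= 0)."""
    return 0 if x < BOUNDARY else 1


def step(x: float, u: float, dt: float = 0.1) -> float:
    """One Euler step of the assumed region-dependent dynamics x' = a_i * x + u."""
    a = (-1.0, 0.5)[region_of(x)]
    return x + dt * (a * x + u)


# One placeholder feedback policy per region, standing in for trained actors.
policies: List[Callable[[float], float]] = [
    lambda x: -0.5 * x,  # agent assumed responsible for region 0
    lambda x: -2.0 * x,  # agent assumed responsible for region 1
]


def rollout(x0: float, horizon: int) -> List[float]:
    """Simulate a finite horizon; the current region selects the acting agent."""
    xs = [x0]
    for _ in range(horizon):
        x = xs[-1]
        u = policies[region_of(x)](x)
        xs.append(step(x, u))
    return xs
```

Calling `rollout(1.0, 3)` returns the initial state followed by three simulated states. A single-agent baseline in the spirit of vanilla DDPG would correspond to using one policy for every region instead of dispatching on `region_of(x)`.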