Learning Hierarchical Control Systems for Autonomous Systems with Energy Constraints (2403.14536v2)
Abstract: This paper addresses the design of hierarchical control architectures for autonomous systems with energy constraints. We focus on systems in which energy storage limitations and slow recharge rates drastically affect how the autonomous systems are operated. Using examples from space robotics and public transportation, we motivate the need for formally designed learning hierarchical control systems. We propose a learning control architecture that incorporates learning mechanisms at various levels of the control hierarchy to improve performance and resource utilization. The proposed hierarchical control scheme relies on high-level energy-aware task planning and assignment, complemented by a low-level predictive control mechanism responsible for the autonomous execution of tasks, including motion control and energy management. Simulation examples show the benefits and the limitations of the proposed architecture when learning is used to obtain a more energy-efficient task allocation.
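The two-level scheme described in the abstract can be illustrated with a minimal toy sketch. The code below is not the paper's method: the high-level planner here is a simple greedy energy-budgeted task selection (standing in for the paper's learned, energy-aware task allocation), and the low-level executive is a crude recharge-before-execute loop with a fixed energy margin (standing in for the predictive motion/energy controller). All names (`Task`, `assign_tasks`, `execute`), the 10% reserve, and the recharge rate are illustrative assumptions.

```python
from dataclasses import dataclass

@dataclass
class Task:
    name: str
    energy_cost: float  # energy required to complete the task (arbitrary units)

def assign_tasks(tasks, battery, reserve=0.1):
    """High-level planner (toy stand-in for the learned allocator):
    greedily select tasks whose cumulative energy cost fits within the
    available battery, keeping a fixed safety reserve. Tasks are taken
    in order of increasing cost so that more tasks fit the budget."""
    budget = battery * (1.0 - reserve)
    plan, spent = [], 0.0
    for t in sorted(tasks, key=lambda t: t.energy_cost):
        if spent + t.energy_cost <= budget:
            plan.append(t)
            spent += t.energy_cost
    return plan

def execute(plan, battery, recharge_rate=0.5):
    """Low-level executive (toy stand-in for predictive control):
    before each task, recharge in slow fixed-rate steps until the
    battery covers the task's cost plus a 10% margin, then run it."""
    log = []
    for t in plan:
        while battery < t.energy_cost * 1.1:  # energy-management check
            battery += recharge_rate          # slow recharge step
        battery -= t.energy_cost
        log.append((t.name, round(battery, 2)))
    return battery, log

# Hypothetical mission: three tasks, 10 units of stored energy.
tasks = [Task("survey", 4.0), Task("drill", 7.0), Task("relay", 2.0)]
plan = assign_tasks(tasks, battery=10.0)
final_battery, log = execute(plan, battery=10.0)
print([t.name for t in plan], round(final_battery, 2))
# → ['relay', 'survey'] 4.0  (the 7-unit drill exceeds the 9-unit budget)
```

In the paper's setting, learning would replace the greedy heuristic with an allocation tuned from execution data, and the executive would solve a receding-horizon optimization rather than a threshold check.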