SMARLA: A Safety Monitoring Approach for Deep Reinforcement Learning Agents (2308.02594v4)
Abstract: Deep Reinforcement Learning (DRL) has made significant advancements in various fields, such as autonomous driving, healthcare, and robotics, by enabling agents to learn optimal policies through interactions with their environments. However, the application of DRL in safety-critical domains presents challenges, particularly concerning the safety of the learned policies. DRL agents, which are focused on maximizing rewards, may select unsafe actions, leading to safety violations. Runtime safety monitoring is thus essential to ensure the safe operation of these agents, especially in unpredictable and dynamic environments. This paper introduces SMARLA, a black-box safety monitoring approach specifically designed for DRL agents. SMARLA utilizes machine learning to predict safety violations by observing the agent's behavior during execution. The approach is based on Q-values, which estimate the expected cumulative reward of taking an action in a given state. SMARLA employs state abstraction to reduce the complexity of the state space, enhancing the predictive capabilities of the monitoring model. Such abstraction enables the early detection of unsafe states, allowing for the implementation of corrective and preventive measures before incidents occur. We quantitatively and qualitatively validated SMARLA on three well-known case studies widely used in DRL research. Empirical results reveal that SMARLA is accurate at predicting safety violations, with a low false positive rate, and can predict violations at an early stage, approximately halfway through the execution of the agent, before violations occur. We also discuss different decision criteria, based on confidence intervals of the predicted violation probabilities, to trigger safety mechanisms, aiming for a trade-off between early detection and low false positive rates.
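To make the pipeline described in the abstract more concrete, the following is a minimal Python sketch of a SMARLA-style monitor: Q-value-based state abstraction, a classifier trained on abstract-state features of past episodes, and a confidence interval around the predicted violation probability. It is an illustration under stated assumptions, not the paper's implementation: the class names (`QAbstraction`, `SafetyMonitor`), the Q-value bucketing rule, the fixed-size occurrence features, the choice of a Random Forest, and the vote-based confidence interval are all assumptions made here for the sketch.

```python
# Illustrative sketch only; names, abstraction rule, and features are assumptions.
import numpy as np
from sklearn.ensemble import RandomForestClassifier


class QAbstraction:
    """Map concrete states to abstract states by bucketing their Q-value profiles."""

    def __init__(self, bucket_width=1.0):
        self.bucket_width = bucket_width  # assumed abstraction granularity
        self.abstract_states = {}         # abstract-state key -> integer id

    def abstract_id(self, q_values):
        # States with matching rounded Q-value profiles share an abstract state.
        key = tuple(np.round(np.asarray(q_values) / self.bucket_width).astype(int))
        return self.abstract_states.setdefault(key, len(self.abstract_states))


class SafetyMonitor:
    """Predict the probability that the current episode will end in a safety violation."""

    def __init__(self, abstraction, n_features=512):
        self.abstraction = abstraction
        self.n_features = n_features
        self.clf = RandomForestClassifier(n_estimators=100)

    def _features(self, episode_q_values):
        # Binary occurrence vector over abstract states visited so far in the episode.
        x = np.zeros(self.n_features)
        for q in episode_q_values:
            x[self.abstraction.abstract_id(q) % self.n_features] = 1.0
        return x

    def fit(self, episodes, labels):
        # episodes: list of per-step Q-value arrays collected offline;
        # labels: 1 if the episode ended in a safety violation, 0 otherwise.
        X = np.stack([self._features(ep) for ep in episodes])
        self.clf.fit(X, np.asarray(labels))

    def violation_probability(self, partial_episode):
        # Per-tree votes give a mean violation probability and a rough confidence
        # interval, which a decision criterion can use to trigger safety mechanisms.
        x = self._features(partial_episode).reshape(1, -1)
        votes = np.array([tree.predict(x)[0] for tree in self.clf.estimators_])
        mean = votes.mean()
        half_width = 1.96 * votes.std() / np.sqrt(len(votes))
        return mean, (mean - half_width, mean + half_width)
```

In use, such a monitor would be trained offline on labeled episodes of the agent and then queried at each step of execution; a decision criterion (e.g., triggering when the lower or upper bound of the interval crosses a threshold) trades off early detection against false positives, as discussed in the abstract.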
Authors: Amirhossein Zolfagharian, Manel Abdellatif, Lionel C. Briand, Ramesh S