
Safety Guaranteed Robust Multi-Agent Reinforcement Learning with Hierarchical Control for Connected and Automated Vehicles (2309.11057v2)

Published 20 Sep 2023 in cs.RO and cs.MA

Abstract: We address the problem of coordination and control of Connected and Automated Vehicles (CAVs) under imperfect observations in mixed traffic environments. A commonly used approach is learning-based decision-making, such as reinforcement learning (RL). However, most existing safe RL methods suffer from two limitations: (i) they assume accurate state information, and (ii) safety is generally defined over the expectation of the trajectories. It remains challenging to design optimal coordination among multiple agents while ensuring hard safety constraints under system state uncertainties (e.g., those arising from noisy sensor measurements, communication, or state estimation) at every time step. We propose a safety-guaranteed hierarchical coordination and control scheme, Safe-RMM, to address this challenge. Specifically, the high-level coordination policy of CAVs in the mixed traffic environment is trained with the Robust Multi-Agent Proximal Policy Optimization (RMAPPO) method. Although trained without uncertainty, our method leverages a worst-case Q network to ensure robust performance when state uncertainties are present during testing. The low-level controller is implemented using model predictive control (MPC) with robust Control Barrier Functions (CBFs), which guarantee safety through their forward-invariance property. We compare our method with baselines on different road networks in the CARLA simulator. Results show that our method provides the best safety and efficiency among the evaluated approaches in challenging mixed traffic environments with uncertainties.
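The abstract mentions a worst-case Q network used to make the learned policy robust to test-time state uncertainty. The paper's own architecture is not reproduced here; the following is only a toy sketch of the underlying max-min idea, selecting the action whose worst-case value over an epsilon-ball of observation perturbations is highest, with the worst case approximated by random sampling. The function names, discrete action set, and sampling scheme are illustrative assumptions, not the paper's method.

```python
import numpy as np

def robust_action(q_fn, obs, actions, eps=0.1, n_samples=32, rng=None):
    """Pick the action with the best worst-case Q value over an
    eps-ball of observation perturbations (approximated by sampling).

    q_fn(obs, action) -> scalar value; actions is a discrete set.
    """
    rng = np.random.default_rng(0) if rng is None else rng
    best_a, best_val = None, -np.inf
    for a in actions:
        # Sample perturbed observations within the uncertainty bound.
        perturbed = obs + rng.uniform(-eps, eps, size=(n_samples, obs.size))
        worst_q = min(q_fn(o, a) for o in perturbed)  # inner minimization
        if worst_q > best_val:                        # outer maximization
            best_a, best_val = a, worst_q
    return best_a
```

In the paper this inner minimization is handled by a trained worst-case Q network rather than sampling; the sketch only illustrates why a max-min criterion yields actions that remain reasonable under bounded observation noise.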

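The low-level controller combines MPC with robust CBFs, whose forward-invariance property keeps the system inside a safe set at every step. As a minimal, simplified illustration (not the paper's implementation), the sketch below applies a robust-CBF safety filter to a scalar car-following acceleration command: the barrier is a time-headway function, and robustness to bounded state error is obtained by enforcing the barrier condition on a worst-case (shrunk) estimate of the gap and lead speed. All parameters (t_headway, alpha, eps) and the specific barrier choice are illustrative assumptions.

```python
def robust_cbf_filter(a_nom, gap, v_ego, v_lead,
                      t_headway=1.5, alpha=1.0, eps=0.5):
    """Project a nominal acceleration onto the safe set defined by a
    time-headway barrier h(x) = gap - t_headway * v_ego.

    Bounded state uncertainty is handled robustly by shrinking the
    observed gap and lead speed by the worst-case error eps before
    enforcing the barrier condition.
    """
    # Worst-case (robust) state: smallest plausible gap and lead speed.
    gap_wc = gap - eps
    v_lead_wc = v_lead - eps
    h = gap_wc - t_headway * v_ego
    # CBF condition: dh/dt >= -alpha * h, with
    # dh/dt = (v_lead - v_ego) - t_headway * a, giving an upper bound on a.
    a_max = ((v_lead_wc - v_ego) + alpha * h) / t_headway
    return min(a_nom, a_max)
```

For a scalar input the CBF quadratic program reduces to this closed-form clamp; in the paper the constraint is instead embedded in the MPC optimization over a full prediction horizon.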