Multi-agent transformer-accelerated RL for satisfaction of STL specifications (2403.15916v1)
Abstract: One of the main challenges in multi-agent reinforcement learning is scalability as the number of agents increases. This issue is further exacerbated if the problem considered is temporally dependent. State-of-the-art solutions today mainly follow the centralized training with decentralized execution paradigm to handle the scalability concerns. In this paper, we propose time-dependent multi-agent transformers, which solve the temporally dependent multi-agent problem efficiently with a centralized approach by using transformers that proficiently handle the large input. We highlight the efficacy of this method on two problems and use tools from statistics to verify the probability that the trajectories generated under the policy satisfy the task. The experiments show that our approach outperforms the baseline algorithms from the literature in both cases.